(57) Abstract: The present invention features
Ad6 vectors and a nucleic acid encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide
containing an inactive NS5B RNA-dependent RNA
polymerase region. The nucleic acid is particularly 
useful as a component of an adenovector or DNA 
plasmid vaccine providing a broad range of antigens 
for generating an HCV specific cell mediated immune 
(CMI) response against HCV. 

TITLE OF THE INVENTION
HEPATITIS C VIRUS VACCINE 

RELATED APPLICATIONS 
The present application claims priority to provisional applications U.S.
Serial No. 60/363,774, filed March 13, 2002, and U.S. Serial No. 60/328,655, filed
October 11, 2001, each of which is hereby incorporated by reference herein.

BACKGROUND OF THE INVENTION 

The references cited in the present application are not admitted to be
prior art to the claimed invention. 

About 3% of the world's population are infected with the Hepatitis C 
virus (HCV). (Wasley et al, Semin. Liver Dis. 20, 1-16, 2000.) Exposure to HCV 
results in an overt acute disease in a small percentage of cases, while in most
instances the virus establishes a chronic infection causing liver inflammation and
slowly progresses into liver failure and cirrhosis. (Iwarson, FEMS Microbiol Rev. 14,
201-204, 1994.) In addition, epidemiological surveys indicate an important role of 
HCV in the pathogenesis of hepatocellular carcinoma. (Kew, FEMS Microbiol Rev. 
14, 211-220, 1994, Alter, Blood 85, 1681-1695, 1995.) 

Prior to the implementation of routine blood screening for HCV in
1992, most infections were contracted by inadvertent exposure to contaminated blood,
blood products or transplanted organs. In those areas where blood screening of HCV 
is carried out, HCV is primarily contracted through direct percutaneous exposure to 
infected blood, i.e., intravenous drug use. Less frequent methods of transmission
include perinatal exposure, hemodialysis, and sexual contact with an HCV infected
person. (Alter et al, N. Engl J. Med 341(8), 556-562, 1999, Alter, J. Hepatol 31
Suppl 88-91, 1999. Semin. Liver. Dis. 201, 1-16, 2000.) 

The HCV genome consists of a single strand RNA of about 9.5 kb
encoding a precursor polyprotein of about 3000 amino acids. (Choo et al, Science
244, 362-364, 1989, Choo et al, Science 244, 359-362, 1989, Takamizawa et al, J.
Virol 65, 1105-1113, 1991.) The HCV polyprotein contains the viral proteins in the
order: C-E1-E2-p7-NS2-NS3-NS4A-NS4B-NS5A-NS5B.

Individual viral proteins are produced by proteolysis of the HCV 
polyprotein. Host cell proteases release the putative structural proteins C, E1, E2, and
p7, and create the N-terminus of NS2 at amino acid 810. (Mizushima et al., J. Virol.
68, 2731-2734, 1994, Hijikata et al, P.N.A.S. USA 90, 10773-10777, 1993.)

The non-structural proteins NS3, NS4A, NS4B, NS5A and NS5B 
presumably form the virus replication machinery and are released from the 
polyprotein. A zinc-dependent protease associated with NS2 and the N-terminus of
NS3 is responsible for cleavage between NS2 and NS3. (Grakoui et al, J. Virol. 67, 
1385-1395, 1993, Hijikata et al, P.N.A.S. USA 90, 10773-10777, 1993.) A distinct 
serine protease located in the N-terminal domain of NS3 is responsible for proteolytic 
cleavages at the NS3/NS4A, NS4A/NS4B, NS4B/NS5A and NS5A/NS5B junctions. 
(Bartenschlager et al, J. Virol. 67, 3835-3844, 1993, Grakoui et al, Proc. Natl. Acad.
Sci. USA 90, 10583-10587, 1993, Tomei et al., J. Virol. 67, 4017-4026, 1993.) 
NS4A provides a cofactor for NS3 activity. (Failla et al, J. Virol. 68, 3753-3760, 
1994, De Francesco et al, U.S. Patent No. 5,739,002.) 

NS5A is a highly phosphorylated protein conferring interferon 
resistance. (De Francesco et al, Semin. Liver Dis., 20(1), 69-83, 2000, Pawlotsky,
Viral Hepat. Suppl. 1, 47-48, 1999.) 

NS5B provides an RNA-dependent RNA polymerase. (De Francesco 
et al., International Publication Number WO 96/37619, Behrens et al, EMBO 15, 12- 
22, 1996, Lohmann et al, Virology 249, 108-118, 1998.)


SUMMARY OF THE INVENTION 

The present invention features Ad6 vectors and a nucleic acid encoding 
a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide containing an inactive NS5B 
RNA-dependent RNA polymerase region. The nucleic acid is particularly useful as a 
component of an adenovector or DNA plasmid vaccine providing a broad range of
antigens for generating an HCV specific cell mediated immune (CMI) response 
against HCV. 

An HCV specific CMI response refers to the production of cytotoxic T
lymphocytes and T helper cells that recognize an HCV antigen. The CMI response
may also include non-HCV specific immune effects.

Preferred nucleic acids encode a Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide that is substantially similar to SEQ. ID. NO. 1 and has sufficient protease 
activity to process itself to produce at least a polypeptide substantially similar to the 
NS5B region present in SEQ. ID. NO. 1. The produced polypeptide corresponding to 
NS5B is enzymatically inactive. More preferably, the HCV polypeptide has sufficient
protease activity to produce polypeptides substantially similar to the NS3, NS4A,
NS4B, NS5A, and NS5B regions present in SEQ. ID. NO. 1. 

Reference to a "substantially similar sequence" indicates an identity of at 
least about 65% to a reference sequence. Thus, for example, polypeptides having an amino 
acid sequence substantially similar to SEQ. ID. NO. 1 have an overall amino acid identity
of at least about 65% to SEQ. ID. NO. 1. 

Polypeptides corresponding to NS3, NS4A, NS4B, NS5A, and NS5B 
have an amino acid sequence identity of at least about 65% to the corresponding 
region in SEQ. ID. NO. 1. Such corresponding polypeptides are also referred to 
herein as NS3, NS4A, NS4B, NS5A, and NS5B polypeptides.

Thus, a first aspect of the present invention describes a nucleic acid 
comprising a nucleotide sequence encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide substantially similar to SEQ. ID. NO. 1. The encoded polypeptide has 
sufficient protease activity to process itself to produce an NS5B polypeptide that is 
enzymatically inactive.

In a preferred embodiment, the nucleic acid is an expression vector 
capable of expressing the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide in a 
desired human cell. Expression inside a human cell has therapeutic applications for 
actively treating an HCV infection and for prophylactically treating against an HCV 
infection.

An expression vector contains a nucleotide sequence encoding a 
polypeptide along with regulatory elements for proper transcription and processing. 
The regulatory elements that may be present include those naturally associated with 
the nucleotide sequence encoding the polypeptide and exogenous regulatory elements 
not naturally associated with the nucleotide sequence. Exogenous regulatory elements
such as an exogenous promoter can be useful for expression in a particular host, such 
as in a human cell. Examples of regulatory elements useful for functional expression 
include a promoter, a terminator, a ribosome binding site, and a polyadenylation 
signal. 

Another aspect of the present invention describes a nucleic acid
comprising a gene expression cassette able to express in a human cell a Met-NS3- 
NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1. The 
polypeptide can process itself to produce an enzymatically inactive NS5B protein. The 
gene expression cassette contains at least the following: 

a) a promoter transcriptionally coupled to a nucleotide sequence
encoding a polypeptide; 

b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence;

c) a terminator joined to the 3' end of the nucleotide sequence; and

d) a 3' polyadenylation signal functionally coupled to the nucleotide
sequence.

Reference to "transcriptionally coupled" indicates that the promoter is
positioned such that transcription of the nucleotide sequence can be brought about by
RNA polymerase binding at the promoter. Transcriptionally coupled does not require
that the sequence being transcribed is adjacent to the promoter.

Reference to "functionally coupled" indicates the ability to mediate an
effect on the nucleotide sequence. Functionally coupled does not require that the
coupled sequences be adjacent to each other. A 3' polyadenylation signal functionally
coupled to the nucleotide sequence facilitates cleavage and polyadenylation of the
transcribed RNA. A 5' ribosome binding site functionally coupled to the nucleotide
sequence facilitates ribosome binding.

In preferred embodiments the nucleic acid is a DNA plasmid vector or
an adenovector suitable for either therapeutic application in treating HCV or as an
intermediate in the production of a therapeutic vector. Treating HCV includes
actively treating an HCV infection and prophylactically treating against an HCV
infection.

Another aspect of the present invention describes an adenovector 
comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette able to express 
a polypeptide substantially similar to SEQ. ID. NO. 1 that is produced by a process
involving (a) homologous recombination and (b) adenovector rescue. The
homologous recombination step produces an adenovirus genome plasmid. The
adenovector rescue step produces the adenovector from the adenovirus genome plasmid.

Adenovirus genome plasmids described herein contain a recombinant 
adenovirus genome having a deletion in the E1 region and optionally in the E3 region
and a gene expression cassette inserted into one of the deleted regions. The 
recombinant adenovirus genome is made of regions substantially similar to one or 
more adenovirus serotypes. 

Another aspect of the present invention describes an adenovector 
consisting of the nucleic acid sequence of SEQ. ID. NO. 4 or a derivative thereof,
wherein said derivative thereof has the HCV polyprotein encoding sequence present
in SEQ. ID. NO. 4 replaced with the HCV polyprotein encoding sequence of either 
SEQ. ID. NO. 3, SEQ. ID. NO. 10 or SEQ. ID. NO. 11. 

Another aspect of the present invention describes a cultured 
recombinant cell comprising a nucleic acid containing a sequence encoding a Met-
NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ. ID. NO. 1.
The recombinant cell has a variety of uses such as being used to replicate nucleic acid 
encoding the polypeptide in vector construction methods. 

Another aspect of the present invention describes a method of making
an adenovector comprising a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette
able to express a polypeptide substantially similar to SEQ. ID. NO. 1. The method
involves the steps of (a) producing an adenovirus genome plasmid containing a
recombinant adenovirus genome with deletions in the E1 and E3 regions and a gene
expression cassette inserted into one of the deleted regions and (b) rescuing the
adenovector from the adenovirus genome plasmid.

Another aspect of the present invention describes a pharmaceutical 
composition comprising a vector for expressing a Met-NS3-NS4A-NS4B-NS5A- 
NS5B polypeptide substantially similar to SEQ. ID. NO. 1 and a pharmaceutically 
acceptable carrier. The vector is suitable for administration and polypeptide
expression in a patient.

A "patient" refers to a mammal capable of being infected with HCV. 
A patient may or may not be infected with HCV. Examples of patients are humans 
and chimpanzees. 

Another aspect of the present invention describes a method of treating
a patient comprising the step of administering to the patient an effective amount of a
vector expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially 
similar to SEQ. ID. NO. 1. The vector is suitable for administration and polypeptide 
expression in the patient. 

The patient undergoing treatment may or may not be infected with
HCV. For a patient infected with HCV, an effective amount is sufficient to achieve
one or more of the following effects: reduce the ability of HCV to replicate, reduce
HCV load, increase viral clearance, and increase one or more HCV specific CMI
responses. For a patient not infected with HCV, an effective amount is sufficient to
achieve one or more of the following: an increased ability to produce one or more
components of an HCV specific CMI response to an HCV infection, a reduced
susceptibility to HCV infection, and a reduced ability of the infecting virus to
establish persistent infection for chronic disease.

Another aspect of the present invention features a recombinant nucleic 
acid comprising an Ad6 region and a region not present in Ad6. Reference to
"recombinant" nucleic acid indicates the presence of two or more nucleic acid regions
not naturally associated with each other. Preferably, the Ad6 recombinant nucleic
acid contains Ad6 regions and a gene expression cassette coding for a polypeptide
heterologous to Ad6.

Other features and advantages of the present invention are apparent 
from the additional descriptions provided herein including the different examples.
The provided examples illustrate different components and methodology useful in 
practicing the present invention. The examples do not limit the claimed invention. 
Based on the present disclosure the skilled artisan can identify and employ other 
components and methodology useful for practicing the present invention. 


BRIEF DESCRIPTION OF THE DRAWINGS 

Figures 1A and 1B illustrate SEQ. ID. NO. 1.

Figures 2A, 2B, 2C, and 2D illustrate SEQ. ID. NO. 2. SEQ. ID. NO.
2 provides a nucleotide sequence coding for SEQ. ID. NO. 1 along with an optimized
internal ribosome entry site and TAAA termination. Nucleotides 1-6 provide an
optimized internal ribosome entry site. Nucleotides 7-5961 code for a HCV Met-
NS3-NS4A-NS4B-NS5A-NS5B polypeptide with nucleotides in positions 5137 to
5145 providing an AlaAlaGly sequence in amino acid positions 1711 to 1713 that
renders NS5B inactive. Nucleotides 5962-5965 provide a TAAA termination.
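
The coordinates given for SEQ. ID. NO. 2 can be cross-checked with simple codon arithmetic. The following is only an illustrative sketch: the function and constant names are hypothetical, and it assumes 1-based inclusive positions with the Met codon starting at nucleotide 7, as stated above.

```python
# Illustrative sketch; names are hypothetical, coordinates are 1-based and inclusive.
CDS_START = 7     # first nucleotide of the Met codon in SEQ. ID. NO. 2 (from the text)
CDS_END = 5961    # last nucleotide of the coding region (from the text)


def codon_span(aa_position):
    """Return (first, last) nucleotide of the codon for a 1-based amino acid position."""
    first = CDS_START + 3 * (aa_position - 1)
    return first, first + 2


def encoded_length():
    """Number of amino acids encoded between CDS_START and CDS_END."""
    return (CDS_END - CDS_START + 1) // 3


if __name__ == "__main__":
    print(codon_span(1711), codon_span(1713), encoded_length())
```

Under these assumptions, amino acid positions 1711 to 1713 map to nucleotides 5137 to 5145, and the coding region corresponds to 1985 amino acids.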

Figures 3A, 3B, 3C, and 3D illustrate SEQ. ID. NO. 3. SEQ. ID. NO.
3 is a codon optimized version of SEQ. ID. NO. 2. Nucleotides 7-5961 encode a 
HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide. 

Figures 4A-4M illustrate MRKAd6-NSmut (SEQ. ID. NO. 4). SEQ.
ID. NO. 4 is an adenovector containing an expression cassette where the polypeptide
of SEQ. ID. NO. 1 is encoded by SEQ. ID. NO. 2. Base pairs 1-450 correspond to
Ad5 base pairs 1 to 450; base pairs 462 to 1252 correspond to the human CMV promoter;
base pairs 1258 to 1267 correspond to the Kozak sequence; base pairs 1264 to 7222
correspond to the NS genes; base pairs 7231 to 7451 correspond to the BGH
polyadenylation signal; base pairs 7469 to 9506 correspond to Ad5 base pairs 3511 to
5548; base pairs 9507 to 32121 correspond to Ad6 base pairs 5542 to 28156; base
pairs 32122 to 35117 correspond to Ad6 base pairs 30789 to 33784; and base pairs
35118 to 37089 correspond to Ad5 base pairs 33967 to 35935.

Figures 5A-5O illustrate SEQ. ID. NOs. 5 and 6. SEQ. ID. NO. 5
encodes a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide with an active 
RNA dependent RNA polymerase. SEQ. ID. NO. 6 provides the amino acid sequence
for the polypeptide. 

Figures 6A-6C provide the nucleic acid sequence for pV1JnsA (SEQ.
ID. NO. 7).

Figures 7A-7N provide the nucleic acid sequence for the Ad6 genome 
(SEQ. ID. NO. 8).

Figures 8A-8K provide the nucleic acid sequence for the Ad5 genome
(SEQ. ID. NO. 9).

Figure 9 illustrates different regions of the Ad6 genome. The linear
(35759 bp) ds DNA genome is indicated by two parallel lines and is divided into 100
map units. Transcription units are shown relative to their position and orientation in
the genome. Early genes (E1A, E1B, E2A/B, E3 and E4) are indicated by gray arrows.
Late genes (L1 to L5), indicated by black arrows, are produced by alternative splicing
of a transcript produced from the major late promoter (MLP) and all contain the
tripartite leader (1, 2, 3) at their 5' ends. The E1 region is located from approximately
1.0 to 11.5 map units, the E2 region from 75.0 to 11.5 map units, E3 from 76.1 to 86.7
map units, and E4 from 99.5 to 91.2 map units. The major late transcription unit is
located between 16.0 and 91.2 map units.
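
Because the 35759 bp genome is divided into 100 map units, a map unit position can be converted to an approximate base pair position by simple proportion. The sketch below is illustrative only; the function name is hypothetical and rounding to the nearest base pair is an assumption, not something stated in the text.

```python
# Illustrative sketch; the helper name and the rounding rule are assumptions.
GENOME_LENGTH_BP = 35759  # Ad6 genome length given for Figure 9


def map_units_to_bp(map_units, genome_length=GENOME_LENGTH_BP):
    """Convert a position in map units (0-100) to an approximate base pair position."""
    if not 0 <= map_units <= 100:
        raise ValueError("map units must lie between 0 and 100")
    return round(map_units / 100.0 * genome_length)


if __name__ == "__main__":
    # Approximate boundaries of the E3 region (76.1 to 86.7 map units).
    print(map_units_to_bp(76.1), map_units_to_bp(86.7))
```

Regions listed with a start greater than their end (for example E4, from 99.5 to 91.2 map units) are given in the direction of transcription, so the two converted values should be read as an ordered pair rather than sorted.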

Figure 10 illustrates homologous recombination to recover pAdE1-E3+
containing Ad6 and Ad5 regions.

Figure 11 illustrates homologous recombination to recover a pAdE1-E3+
containing Ad6 regions.

Figure 12 illustrates a western blot on whole-cell extracts from 293
cells transfected with plasmid DNA expressing different HCV NS cassettes. Mature
NS3 and NS5A products were detected with specific antibodies. "pV1Jns-NS" refers
to a pV1JnsA plasmid where a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide is
encoded by SEQ. ID. NO. 5, and SEQ. ID. NO. 5 is inserted between bases 1881 and
1912 of SEQ. ID. NO. 7. "pV1Jns-NSmut" refers to a pV1JnsA plasmid where SEQ.
ID. NO. 2 is inserted between bases 1882 and 1925 of SEQ. ID. NO. 7. "pV1Jns-
NSOPTmut" refers to a pV1JnsA plasmid where SEQ. ID. NO. 3 is inserted between
bases 1881 and 1905 of SEQ. ID. NO. 7.

Figures 13A and 13B illustrate T cell responses by IFNγ ELISPOT
induced in C57black6 mice (A) and BalbC mice (B) by two injections of 25 µg and
50 µg, respectively, of plasmid DNA encoding the different HCV NS cassettes with
Gene Electro-Transfer (GET).

Figure 14 illustrates protein expression from different adenovectors
upon infection of HeLa cells. MRKAd5-NSmut is an adenovector based on an Ad5
sequence (SEQ. ID. NO. 9), where the Ad5 genome has an E1 deletion of base pairs
451 to 3510, an E3 deletion of base pairs 28134 to 30817, and has the NS3-NS4A-
NS4B-NS5A-NS5B expression cassette as provided in base pairs 451 to 7468 of SEQ.
ID. NO. 4 inserted between positions 450 and 3511. Ad5-NS is an adenovector based
on an Ad5 backbone with an E1 deletion of base pairs 342 to 3523, an E3 deletion of
base pairs 28134 to 30817 and containing an expression cassette encoding a NS3-
NS4A-NS4B-NS5A-NS5B from SEQ. ID. NO. 5. "MRKAd6-NSOPTmut" refers to
an adenovector having a modified SEQ. ID. NO. 4 sequence, wherein base pairs 1258
to 7222 of SEQ. ID. NO. 4 are replaced with SEQ. ID. NO. 3.

Figure 15 illustrates T cell responses by IFNγ ELISPOT induced in
C57black6 mice by two injections of 10^9 vp of adenovectors containing different
HCV non-structural gene cassettes. 

Figures 16A-16D illustrate T cell responses by IFNγ ELISPOT induced
in Rhesus monkeys by one or two injections of 10^10 vp (A) or 10^11 vp (B) of
adenovectors containing different HCV non-structural gene cassettes. 

Figures 17A and 17B illustrate CD8+ T cell responses by IFNγ ICS
induced in Rhesus monkeys by two injections of 10^10 vp (A) or 10^11 vp (B) of
adenovectors encoding the different HCV non-structural gene cassettes. 

Figures 18A-18F illustrate T cell responses by bulk CTL assay induced
in Rhesus monkeys by two injections of 10^11 vp of Ad5-NS (A), MRKAd5-NSmut
(B), or MRKAd6-NSmut (C).

Figure 19 illustrates the plasmid pE2. 

Figures 20A-D illustrate the partial codon optimized sequence
NSsuboptmut (SEQ. ID. NO. 10). Coding sequence for the Met-NS3-NS4A-NS4B-
NS5A-NS5B polypeptide is from base 7 to 5961. 

DETAILED DESCRIPTION OF THE INVENTION

The present invention features Ad6 vectors and nucleic acid encoding a 
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide that contains an inactive NS5B 
region. Providing an inactive NS5B region supplies NS5B antigens while reducing the 
possibility of adverse side effects due to an active viral RNA polymerase. Uses of the
featured nucleic acid include use as a vaccine component to introduce into a cell an 
HCV polypeptide that provides a broad range of antigens for generating a CMI 
response against HCV, and as an intermediate for producing such a vaccine 
component. 

The adaptive cellular immune response can function to recognize viral
antigens in HCV infected cells throughout the body due to the ubiquitous distribution
of major histocompatibility complex (MHC) class I and II expression, to induce
immunological memory, and to maintain immunological memory. These functions
are attributed to antigen-specific CD4+ T helper (Th) and CD8+ cytotoxic T cells
(CTL).

Upon activation via their specific T cell receptors, HCV specific Th
cells fulfill a variety of immunoregulatory functions, most of them mediated by Th1
and Th2 cytokines. HCV specific Th cells assist in the activation and differentiation
of B cells and induction and stimulation of virus-specific cytotoxic T cells. Together
with CTL, Th cells may also secrete IFN-γ and TNF-α that inhibit replication and
gene expression of several viruses. Additionally, Th cells and CTL, the main effector
cells, can induce apoptosis and lysis of virus infected cells.

HCV specific CTL are generated from antigens processed by 
professional antigen presenting cells (pAPCs). Antigens can be either synthesized 
within or introduced into pAPCs. Antigen synthesis in a pAPC can be brought about
by introducing into the cell an expression cassette encoding the antigen. 

A preferred route of nucleic acid vaccine administration is an 
intramuscular route. Intramuscular administration appears to result in the introduction 
and expression of nucleic acid into somatic cells and pAPCs. HCV antigens produced 
in the somatic cells can be transferred to pAPCs for presentation in the context of
MHC class I molecules. (Donnelly et al, Annu. Rev. Immunol 15:617-648, 1997.)

pAPCs process longer length antigens into smaller peptide antigens in 
the proteasome complex. The antigen is translocated into the endoplasmic 
reticulum/Golgi complex secretory pathway for association with MHC class I 
proteins. CD8+ T lymphocytes recognize antigen associated with class I MHC via the
T cell receptor (TCR) and the CD8 cell surface protein. 

Using a nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide as a vaccine component allows for production of a broad range of
antigens capable of generating CMI responses from a single vector. The polypeptide
should be able to process itself sufficiently to produce at least a region corresponding
to NS5B. Preferred nucleic acids encode an amino acid sequence substantially similar
to SEQ. ID. NO. 1 that has sufficient protease activity to process itself to produce
individual HCV polypeptides substantially similar to the NS3, NS4A, NS4B, NS5A,
and NS5B regions present in SEQ. ID. NO. 1.

A polypeptide substantially similar to SEQ. ID. NO. 1 with sufficient 
protease activity to process itself in a cell provides the cell with T cell epitopes that 
are present in several different HCV strains. Protease activity is provided by NS3 and 
NS3/NS4A proteins digesting the Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide at
the appropriate cleavage sites to release polypeptides corresponding to NS3, NS4A,
NS4B, NS5A, and NS5B. Self-processing of the Met-NS3-NS4A-NS4B-NS5A-
NS5B generates polypeptides that approximate naturally occurring HCV polypeptides.

Based on the guidance provided herein a sufficiently strong immune
response can be generated to achieve beneficial effects in a patient. The provided
guidance includes information concerning HCV sequence selection, vector selection,
vector production, combination treatment, and administration.

I. HCV SEQUENCES

A variety of different nucleic acid sequences can be used as a vaccine
component to supply a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide to a
cell or as an intermediate to produce vaccine components. The starting point for
obtaining suitable nucleic acid sequences is preferably a naturally occurring NS3-
NS4A-NS4B-NS5A-NS5B polypeptide sequence modified to produce an inactive
NS5B.

The use of a HCV nucleic acid sequence providing HCV non-structural
antigens to generate a CMI response is mentioned by Cho et al, Vaccine 17:1136-
1144, 1999, Paliard et al., International Publication Number WO 01/30812 (not
admitted to be prior art to the claimed invention), and Coit et al, International
Publication Number WO 01/38360 (not admitted to be prior art to the claimed
invention). Such references fail to describe, for example, a polypeptide that processes
itself to produce an inactive NS5B, and the particular combinations of HCV
sequences and delivery vehicles employed herein. 

Modifications to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide sequence can be produced by altering the encoding nucleic acid. 
Alterations can be performed to create deletions, insertions and substitutions.

Small modifications can be made in NS5B to produce an inactive
polymerase by targeting motifs essential for replication. Examples of motifs critical
for NS5B activity and modifications that can be made to produce an inactive NS5B
are described by Lohmann et al, Journal of Virology 71:8416-8426, 1997, and
Kolykhalov et al, Journal of Virology 74:2046-2051, 2000.

Additional factors to take into account when producing modifications 
to a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide include maintaining the 
ability to self-process and maintaining T cell antigens. The ability of the HCV 
polypeptide to process itself is determined to a large extent by a functional NS3 
protease. Modifications that maintain NS3 protease activity can be obtained
by taking into account the NS3 protein, NS4A which serves as a cofactor for NS3, and 
NS3 protease recognition sites present within the NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide. 

Different modifications can be made to naturally occurring NS3- 
NS4A-NS4B-NS5A-NS5B polypeptide sequences to produce polypeptides able to 
elicit a broad range of T cell responses. Factors influencing the ability of a 
polypeptide to elicit a broad T cell response include the preservation or introduction 
of HCV specific T cell antigen regions and prevalence of different T cell antigen 
regions in different HCV isolates. 

Numerous examples of naturally occurring HCV isolates are well
known in the art. HCV isolates can be classified into the following six major
genotypes comprising one or more subtypes: HCV-1/(1a,1b,1c), HCV-2/(2a,2b,2c),
HCV-3/(3a,3b,10a), HCV-4/(4a), HCV-5/(5a) and HCV-6/(6a,6b,7b,8b,9a,11a).
(Simmonds, J. Gen. Virol, 693-712, 2001.) Examples of particular HCV sequences
such as HCV-BK, HCV-J, HCV-N, HCV-H, have been deposited in GenBank and
described in various publications. (See, for example, Chamberlain et al, J. Gen.
Virol, 1341-1347, 1997.)

HCV T cell antigens can be identified by, for example, empirical
experimentation. One way of identifying T cell antigens involves generating a series
of overlapping short peptides from a longer length polypeptide and then screening the
T-cell populations from infected patients for positive clones. Positive clones are
activated/primed by a particular peptide. Techniques such as IFNγ-ELISPOT, IFNγ-
Intracellular staining and bulk CTL assays can be used to measure peptide activity.
Peptides thus identified can be considered to represent T-cell epitopes of the
respective pathogen.
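
The generation of overlapping short peptides from a longer polypeptide can be sketched as a sliding window. This is a minimal illustration only: the function name, the default peptide length of 15 and the default offset of 5 are assumptions chosen for the example and are not values taken from the text.

```python
# Illustrative sketch; the helper name and default window sizes are assumptions.
def overlapping_peptides(sequence, length=15, step=5):
    """Return (1-based start, peptide) pairs tiling the sequence with overlap."""
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [(1, sequence)]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)  # add a final peptide so the C-terminus is covered
    return [(s + 1, sequence[s:s + length]) for s in starts]


if __name__ == "__main__":
    for start, peptide in overlapping_peptides("ABCDEFGHIJKL", length=5, step=3):
        print(start, peptide)
```

Each peptide overlaps its neighbor by `length - step` residues, so an epitope shorter than the overlap is contained whole in at least one peptide.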

HCV T cell antigen regions from different HCV isolates can be 
introduced into a single sequence by, for example, producing a hybrid NS3-NS4A- 
NS4B-NS5A-NS5B polypeptide containing regions from two or more naturally 
occurring sequences. Such a hybrid can contain additional modifications, which 
preferably do not reduce the ability of the polypeptide to produce an HCV CMI
response. 

The ability of a modified Met-NS3-NS4A-NS4B-NS5A-NS5B 
polypeptide to process itself and produce a CMI response can be determined using 
techniques described herein or well known in the art. Such techniques include the use 
of IFNγ-ELISPOT, IFNγ-Intracellular staining and bulk CTL assays to measure a
HCV specific CMI response. 

A. Met-NS3-NS4A-NS4B-NS5A-NS5B Sequences 

SEQ. ID. NO. 1 provides a preferred Met-NS3-NS4A-NS4B-NS5A-
NS5B sequence. SEQ. ID. NO. 1 contains a large number of HCV specific T cell
antigens that are present in several different HCV isolates. SEQ. ID. NO. 1 is similar
to the NS3-NS4A-NS4B-NS5A-NS5B portion of the HCV BK strain nucleotide
sequence (GenBank accession number M58335).

In SEQ. ID. NO. 1 anchor positions important for recognition by MHC
class I molecules are conserved or represent conservative substitutions for 18 out of
20 known T-cell epitopes in the NS3-NS4A-NS4B-NS5A-NS5B portion of HCV
polyproteins. With respect to the remaining two known T-cell epitopes, one has a
non-conservative anchor substitution in SEQ. ID. NO. 1 that may still be recognized
by a different HLA supertype and one epitope has one anchor residue not conserved.
HCV T-cell epitopes are described in Chisari et al., Curr. Top. Microbiol Immunol.,
242:299-325, 2000, and Lechner et al. J. Exp. Med. 191:1499-1512, 2000.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B 
nucleotide sequence and SEQ. ID. NO. 1 include the introduction of a methionine at 
the 5' end and the presence of modified NS5B active site residues in SEQ. ID. NO. 1. 
The modification replaces GlyAspAsp with AlaAlaGly (residues 1711-1713) to
inactivate NS5B. 

The encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide
preferably has an amino acid sequence substantially similar to SEQ. ID. NO. 1. In
different embodiments, the encoded HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide has an amino acid identity to SEQ. ID. NO. 1 of at least 65%, at least
75%, at least 85%, at least 95%, at least 99% or 100%; or differs from SEQ. ID. NO. 
1 by 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1- 
17, 1-18, 1-19, or 1-20 amino acids. 

Amino acid differences between a Met-NS3-NS4A-NS4B-NS5A-
NS5B polypeptide and SEQ. ID. NO. 1 are calculated by determining the minimum
number of amino acid modifications in which the two sequences differ. Amino acid 
modifications can be deletions, additions, substitutions or any combination thereof. 
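
One common way to compute a minimum number of deletions, additions and substitutions between two sequences is the Levenshtein edit distance. The sketch below is offered only as an illustration of that general technique; the function name is hypothetical, and the text does not state that this particular algorithm is the one used.

```python
# Illustrative sketch of the Levenshtein edit distance; the name is hypothetical.
def min_modifications(seq_a, seq_b):
    """Minimum number of deletions, additions and substitutions turning seq_a into seq_b."""
    previous = list(range(len(seq_b) + 1))
    for i, res_a in enumerate(seq_a, start=1):
        current = [i]
        for j, res_b in enumerate(seq_b, start=1):
            current.append(min(
                previous[j] + 1,                     # delete res_a
                current[j - 1] + 1,                  # add res_b
                previous[j - 1] + (res_a != res_b),  # substitute, or keep if identical
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    # GlyAspAsp (GDD) versus AlaAlaGly (AAG) in one-letter code.
    print(min_modifications("GDD", "AAG"))
```

For the one-letter codes GDD and AAG the function counts three modifications, one substitution per residue.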

Amino acid sequence identity is determined by methods well known in 
the art that compare the amino acid sequence of one polypeptide to the amino acid
sequence of a second polypeptide and generate a sequence alignment. Amino acid 
identity is calculated from the alignment by counting the number of aligned residue 
pairs that have identical amino acids. 
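
Counting identical aligned residue pairs can be sketched as follows for two sequences that have already been aligned. This is an illustration only: the function name is hypothetical, "-" is assumed to mark a gap, and dividing by the full alignment length is an assumption, since the text does not specify the denominator.

```python
# Illustrative sketch; the helper name, gap symbol and denominator are assumptions.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for res_a, res_b in zip(aligned_a, aligned_b)
        if res_a == res_b and res_a != gap
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    identity = percent_identity("MAPIT-YS", "MAPLTAYS")
    print(identity, identity >= 65.0)
```

A value computed this way could then be compared against a threshold such as the "at least about 65%" used to define a substantially similar sequence.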

Methods for determining sequence identity include those described by
Schuler, G.D. in Bioinformatics: A Practical Guide to the Analysis of Genes and
Proteins, Baxevanis, A.D. and Ouellette, B.F.F., eds., John Wiley & Sons, Inc, 2001;
Yona, et al., in Bioinformatics: Sequence, structure and databanks, Higgins, D. and
Taylor, W. eds, Oxford University Press, 2000; and Bioinformatics: Sequence and
Genome Analysis, Mount, D.W., ed., Cold Spring Harbor Laboratory Press, 2001.
Methods to determine amino acid sequence identity are codified in publicly available
computer programs such as GAP (Wisconsin Package Version 10.2, Genetics
Computer Group (GCG), Madison, Wisc.), BLAST (Altschul et al., J. Mol. Biol.
215:403-10, 1990), and FASTA (Pearson, Methods in Enzymology 183:63-98,
1990, R.F. Doolittle, ed.).

In an embodiment of the present invention, sequence identity between
two polypeptides is determined using the GAP program (Wisconsin Package Version
10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the alignment
method of Needleman and Wunsch. (Needleman et al., J. Mol. Biol. 48:443-453,
1970.) GAP considers all possible alignments and gap positions between two
sequences and creates a global alignment that maximizes the number of matched

WO 03/031588 



PCT/US02/32512 



residues and minimizes the number and size of gaps. A scoring matrix is used to
assign values for symbol matches. In addition, a gap creation penalty and a gap
extension penalty are required to limit the insertion of gaps into the alignment.
Default program parameters for polypeptide comparisons using GAP are the
BLOSUM62 (Henikoff et al., Proc. Natl. Acad. Sci. USA, 89:10915-10919, 1992)
amino acid scoring matrix (MATrix=blosum62.cmp), a gap creation parameter
(GAPweight=8) and a gap extension parameter (LENgthweight=2).
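
The role of the two gap penalties can be illustrated with a minimal global
alignment scorer. The Python sketch below is not the GAP program and does not
reproduce its output: it substitutes a simple match/mismatch score (hypothetical
values 4 and -2) for the BLOSUM62 matrix, and it assumes that a gap of length k
costs the creation penalty plus k times the extension penalty. Only the default
values 8 and 2 are taken from the text above.

```python
def affine_global_score(a, b, match=4, mismatch=-2, gap_open=8, gap_extend=2):
    """Best global alignment score with separate gap creation/extension penalties."""
    neg = float("-inf")
    n, m = len(a), len(b)
    # M: a[i-1] aligned to b[j-1]; X: a[i-1] aligned to a gap;
    # Y: b[j-1] aligned to a gap.
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + j * gap_extend)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap pays creation plus one extension unit;
            # lengthening an existing gap pays the extension unit only.
            X[i][j] = max(M[i - 1][j] - gap_open - gap_extend,
                          Y[i - 1][j] - gap_open - gap_extend,
                          X[i - 1][j] - gap_extend)
            Y[i][j] = max(M[i][j - 1] - gap_open - gap_extend,
                          X[i][j - 1] - gap_open - gap_extend,
                          Y[i][j - 1] - gap_extend)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(affine_global_score("ACGT", "ACT"))
```

Raising the creation penalty relative to the extension penalty favours a few long
gaps over many short ones, which is the behaviour the two parameters control.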

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptides, in addition to being substantially similar to SEQ. ID. NO. 1 across their
entire length, produce individual NS3, NS4A, NS4B, NS5A and NS5B regions that are
substantially similar to the corresponding regions present in SEQ. ID. NO. 1. The
corresponding regions in SEQ. ID. NO. 1 are provided as follows: Met-NS3 amino
acids 1-632; NS4A amino acids 633-686; NS4B amino acids 687-947; NS5A amino
acids 948-1394; and NS5B amino acids 1395-1985.

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B
region has an amino acid identity to the corresponding region in SEQ. ID. NO. 1 of at
least 65%, at least 75%, at least 85%, at least 95%, at least 99%, or 100%; or an
amino acid difference of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13,
1-14, 1-15, 1-16, 1-17, 1-18, 1-19, or 1-20 amino acids.

Amino acid modifications to SEQ. ID. NO. 1 preferably maintain all or
most of the T-cell antigen regions. Differences in naturally occurring amino acids are
due to different amino acid side chains (R groups). An R group affects different
properties of the amino acid such as physical size, charge, and hydrophobicity.
Amino acids can be divided into different groups as follows: neutral and hydrophobic
(alanine, valine, leucine, isoleucine, proline, tryptophan, phenylalanine, and
methionine); neutral and polar (glycine, serine, threonine, tyrosine, cysteine,
asparagine, and glutamine); basic (lysine, arginine, and histidine); and acidic (aspartic
acid and glutamic acid).

Generally, in substituting different amino acids it is preferable to
exchange amino acids having similar properties. Substitutions of different amino acids
within a particular group, such as valine for leucine, arginine for lysine,
and asparagine for glutamine, are good candidates for not causing a change in
polypeptide tertiary structure.
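
The grouping above can be expressed as a simple lookup. The Python sketch below
is illustrative only; the dictionary and function names are assumptions, and the
group membership is taken directly from the four groups listed in the text, using
one-letter amino acid codes.

```python
# Amino acid groups as listed in the text, in one-letter code.
AMINO_ACID_GROUPS = {
    "neutral and hydrophobic": set("AVLIPWFM"),
    "neutral and polar": set("GSTYCNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def group_of(residue):
    """Return the name of the group containing the given one-letter residue."""
    for name, members in AMINO_ACID_GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_within_group(original, replacement):
    """True when a substitution exchanges residues from the same group."""
    return group_of(original) == group_of(replacement)


if __name__ == "__main__":
    for pair in (("L", "V"), ("K", "R"), ("Q", "N"), ("D", "A")):
        print(pair, is_within_group(*pair))
```

Under this grouping the Asp to Ala change in the NS5B active site crosses groups,
which is consistent with a modification intended to alter function.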

Starting with a particular amino acid sequence and the known
degeneracy of the genetic code, a large number of different encoding nucleic acid
sequences can be obtained. The degeneracy of the genetic code arises because almost
all amino acids are encoded by different combinations of nucleotide triplets or
"codons". The translation of a particular codon into a particular amino acid is well
known in the art (see, e.g., Lewin, GENES IV, p. 119, Oxford University Press, 1990).
Amino acids are encoded by codons as follows:
A=Ala=Alanine: codons GCA, GCC, GCG, GCU
C=Cys=Cysteine: codons UGC, UGU
D=Asp=Aspartic acid: codons GAC, GAU
E=Glu=Glutamic acid: codons GAA, GAG
F=Phe=Phenylalanine: codons UUC, UUU
G=Gly=Glycine: codons GGA, GGC, GGG, GGU
H=His=Histidine: codons CAC, CAU
I=Ile=Isoleucine: codons AUA, AUC, AUU
K=Lys=Lysine: codons AAA, AAG
L=Leu=Leucine: codons UUA, UUG, CUA, CUC, CUG, CUU
M=Met=Methionine: codon AUG
N=Asn=Asparagine: codons AAC, AAU
P=Pro=Proline: codons CCA, CCC, CCG, CCU
Q=Gln=Glutamine: codons CAA, CAG
R=Arg=Arginine: codons AGA, AGG, CGA, CGC, CGG, CGU
S=Ser=Serine: codons AGC, AGU, UCA, UCC, UCG, UCU
T=Thr=Threonine: codons ACA, ACC, ACG, ACU
V=Val=Valine: codons GUA, GUC, GUG, GUU
W=Trp=Tryptophan: codon UGG
Y=Tyr=Tyrosine: codons UAC, UAU.
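
To illustrate how quickly the number of possible encoding sequences grows, the
codon assignments listed above can be placed in a table and multiplied out for a
given peptide. The Python sketch below is illustrative only; the names CODONS and
count_encodings are assumptions introduced for the example.

```python
from math import prod

# Codons per amino acid (one-letter code), as listed in the text.
CODONS = {
    "A": ["GCA", "GCC", "GCG", "GCU"],
    "C": ["UGC", "UGU"],
    "D": ["GAC", "GAU"],
    "E": ["GAA", "GAG"],
    "F": ["UUC", "UUU"],
    "G": ["GGA", "GGC", "GGG", "GGU"],
    "H": ["CAC", "CAU"],
    "I": ["AUA", "AUC", "AUU"],
    "K": ["AAA", "AAG"],
    "L": ["UUA", "UUG", "CUA", "CUC", "CUG", "CUU"],
    "M": ["AUG"],
    "N": ["AAC", "AAU"],
    "P": ["CCA", "CCC", "CCG", "CCU"],
    "Q": ["CAA", "CAG"],
    "R": ["AGA", "AGG", "CGA", "CGC", "CGG", "CGU"],
    "S": ["AGC", "AGU", "UCA", "UCC", "UCG", "UCU"],
    "T": ["ACA", "ACC", "ACG", "ACU"],
    "V": ["GUA", "GUC", "GUG", "GUU"],
    "W": ["UGG"],
    "Y": ["UAC", "UAU"],
}


def count_encodings(peptide):
    """Number of distinct nucleotide sequences encoding the given peptide."""
    return prod(len(CODONS[residue]) for residue in peptide)


if __name__ == "__main__":
    print(count_encodings("GDD"), count_encodings("AAG"))
```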

Nucleic acid sequences can be optimized in an effort to enhance
expression in a host. Factors to be considered include C:G content, preferred codons,
and the avoidance of inhibitory secondary structure. These factors can be combined
in different ways in an attempt to obtain nucleic acid sequences having enhanced
expression in a particular host. (See, for example, Donnelly et al., International
Publication Number WO 97/47358.)
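
One of the factors named above, the C:G content, is straightforward to measure
when comparing candidate sequences. The Python sketch below is illustrative only
and is not an optimization method from the disclosure; the function name and the
two six-nucleotide examples (both encoding AlaAsp) are assumptions.

```python
def gc_fraction(sequence):
    """Fraction of positions that are G or C in a DNA or RNA sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    return sum(1 for base in sequence if base in "GC") / len(sequence)


if __name__ == "__main__":
    # Two synonymous encodings of AlaAsp with different C:G content.
    print(gc_fraction("GCUGAU"), gc_fraction("GCCGAC"))
```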

The ability of a particular sequence to have enhanced expression in a 
particular host involves some empirical experimentation. Such experimentation 
involves measuring expression of a prospective nucleic acid sequence and, if needed, 
altering the sequence.

B. Encoding Nucleotide Sequences

SEQ. ID. NOs. 2 and 3 provide two examples of nucleotide sequences
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B sequence. The coding sequence of
SEQ. ID. NO. 2 is similar (99.4% nucleotide sequence identity) to the
NS3-NS4A-NS4B-NS5A-NS5B region of the naturally occurring HCV-BK sequence
(GenBank accession number M58335). SEQ. ID. NO. 3 is a codon-optimized version
of SEQ. ID. NO. 2. SEQ. ID. NOs. 2 and 3 have a nucleotide sequence identity of 78.3%.

Differences between the HCV-BK NS3-NS4A-NS4B-NS5A-NS5B
nucleotide sequence (GenBank accession number M58335) and SEQ. ID. NO. 2 include
SEQ. ID. NO. 2 having a ribosome binding site, an ATG methionine codon, a region
coding for a modified NS5B catalytic domain, a TAAA stop signal and an additional 30
nucleotide differences. The modified catalytic domain codes for AlaAlaGly
(residues 1711-1713) instead of GlyAspAsp to inactivate NS5B.

A nucleotide sequence encoding a HCV Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide is preferably substantially similar to the SEQ. ID. NO. 2
coding region. In different embodiments, the nucleotide sequence encoding a HCV
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide has a nucleotide sequence identity
to the SEQ. ID. NO. 2 coding region of at least 65%, at least 75%, at least 85%, at
least 95%, at least 99%, or 100%; or differs from SEQ. ID. NO. 2 by 1-2, 1-3, 1-4,
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20,
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides.

Nucleotide differences between a sequence coding
Met-NS3-NS4A-NS4B-NS5A-NS5B and the SEQ. ID. NO. 2 coding region are calculated by
determining the minimum number of nucleotide modifications in which the two
sequences differ. Nucleotide modifications can be deletions, additions, substitutions
or any combination thereof.

Nucleotide sequence identity is determined by methods well known in
the art that compare the nucleotide sequence of one sequence to the nucleotide
sequence of a second sequence and generate a sequence alignment. Sequence identity
is determined from the alignment by counting the number of aligned positions having
identical nucleotides.

Methods for determining nucleotide sequence identity between two
polynucleotides include those described by Schuler, in Bioinformatics: A Practical
Guide to the Analysis of Genes and Proteins, Baxevanis, A.D. and Ouellette, B.F.F.,
eds., John Wiley & Sons, Inc, 2001; Yona et al., in Bioinformatics: Sequence,
structure and databanks, Higgins, D. and Taylor, W. eds, Oxford University Press,
2000; and Bioinformatics: Sequence and Genome Analysis, Mount, D.W., ed., Cold
Spring Harbor Laboratory Press, 2001. Methods to determine nucleotide sequence
identity are codified in publicly available computer programs such as GAP
(Wisconsin Package Version 10.2, Genetics Computer Group (GCG), Madison,
Wisc.), BLAST (Altschul et al., J. Mol. Biol. 215(3):403-10, 1990), and FASTA
(Pearson, W.R., Methods in Enzymology 183:63-98, 1990, R.F. Doolittle, ed.).

In an embodiment of the present invention, sequence identity between
two polynucleotides is determined by application of GAP (Wisconsin Package
Version 10.2, Genetics Computer Group (GCG), Madison, Wisc.). GAP uses the
alignment method of Needleman and Wunsch. (Needleman et al., J. Mol. Biol.
48:443-453, 1970.) GAP considers all possible alignments and gap positions between
two sequences and creates a global alignment that maximizes the number of matched
residues and minimizes the number and size of gaps. A scoring matrix is used to
assign values for symbol matches. In addition, a gap creation penalty and a gap
extension penalty are required to limit the insertion of gaps into the alignment.
Default program parameters for polynucleotide comparisons using GAP are the
nwsgapdna.cmp scoring matrix (MATrix=nwsgapdna.cmp), a gap creation parameter
(GAPweight=50) and a gap extension parameter (LENgthweight=3).

More preferred HCV Met-NS3-NS4A-NS4B-NS5A-NS5B nucleotide
sequences, in addition to being substantially similar across their entire length, produce
individual NS3, NS4A, NS4B, NS5A and NS5B regions that are substantially similar
to the corresponding regions present in SEQ. ID. NO. 2. The corresponding coding
regions in SEQ. ID. NO. 2 are provided as follows: Met-NS3 nucleotides 7-1902;
NS4A nucleotides 1903-2064; NS4B nucleotides 2065-2847; NS5A nucleotides
2848-4188; and NS5B nucleotides 4189-5661.

In different embodiments a NS3, NS4A, NS4B, NS5A and/or NS5B
encoding region has a nucleotide sequence identity to the corresponding region in
SEQ. ID. NO. 2 of at least 65%, at least 75%, at least 85%, at least 95%, at least 99%
or 100%; or a nucleotide difference to SEQ. ID. NO. 2 of 1-2, 1-3, 1-4, 1-5, 1-6, 1-7,
1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20, 1-25, 1-30,
1-35, 1-40, 1-45, or 1-50 nucleotides.

C. Gene Expression Cassettes

A gene expression cassette contains elements needed for polypeptide
expression. Reference to "polypeptide" does not provide a size limitation and
includes protein. Regulatory elements present in a gene expression cassette generally
include: (a) a promoter transcriptionally coupled to a nucleotide sequence encoding
the polypeptide, (b) a 5' ribosome binding site functionally coupled to the nucleotide
sequence, (c) a terminator joined to the 3' end of the nucleotide sequence, and (d) a 3'
polyadenylation signal functionally coupled to the nucleotide sequence. Additional
regulatory elements useful for enhancing or regulating gene expression or polypeptide
processing may also be present.

Promoters are genetic elements that are recognized by an RNA
polymerase and mediate transcription of downstream regions. Preferred promoters
are strong promoters that provide for increased levels of transcription. Examples of
strong promoters are the immediate early human cytomegalovirus promoter (CMV),
and CMV with intron A. (Chapman et al., Nucl. Acids Res. 19:3979-3986, 1991.)
Additional examples of promoters include naturally occurring promoters such as the
EF1 alpha promoter, the murine CMV promoter, Rous sarcoma virus promoter,
SV40 early/late promoters and the β-actin promoter; and artificial promoters such as a
synthetic muscle specific promoter and a chimeric muscle-specific/CMV promoter (Li
et al., Nat. Biotechnol. 17:241-245, 1999, Hagstrom et al., Blood 95:2536-2542,
2000).

The ribosome binding site is located at or near the initiation codon.
Examples of preferred ribosome binding sites include CCACCAUGG,
CCGCCAUGG, and ACCAUGG, where AUG is the initiation codon. (Kozak, Cell
44:283-292, 1986). Another example of a ribosome binding site is GCCACCAUGG
(SEQ. ID. NO. 12).
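
As an illustration, a transcript can be scanned for initiation codons that sit in
one of the ribosome binding site contexts listed above. The Python sketch below
is illustrative only; the function name and the test sequences are assumptions,
and the site list contains only the four sequences given in the text.

```python
# Ribosome binding sites from the text, longest first so that the most
# specific context is reported for each initiation codon.
RBS_SITES = ("GCCACCAUGG", "CCACCAUGG", "CCGCCAUGG", "ACCAUGG")


def rbs_at_start_codons(sequence):
    """Map the position of each AUG to the listed binding site it sits in."""
    rna = sequence.upper().replace("T", "U")
    found = {}
    pos = rna.find("AUG")
    while pos != -1:
        for site in RBS_SITES:
            start = pos - site.index("AUG")
            if start >= 0 and rna[start:start + len(site)] == site:
                found[pos] = site
                break
        pos = rna.find("AUG", pos + 1)
    return found


if __name__ == "__main__":
    print(rbs_at_start_codons("GCCACCATGGCG"))
```

Positions are zero-based offsets of the A in AUG; DNA input is accepted by
converting T to U before matching.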

The polyadenylation signal is responsible for cleaving the transcribed
RNA and the addition of a poly (A) tail to the RNA. The polyadenylation signal in
higher eukaryotes contains an AAUAAA sequence about 11-30 nucleotides from the
polyadenylation addition site. The AAUAAA sequence is involved in signaling RNA
cleavage. (Lewin, Genes IV, Oxford University Press, NY, 1990.) The poly (A) tail
is important for the mRNA processing.

Polyadenylation signals that can be used as part of a gene expression
cassette include the minimal rabbit β-globin polyadenylation signal and the bovine
growth hormone polyadenylation (BGH). (Xu et al., Gene 272:149-156, 2001, Post et
al., U.S. Patent No. 5,122,458.) Additional examples include the Synthetic
Polyadenylation Signal (SPA) and SV40 polyadenylation signal. The SPA sequence
is as follows: AAUAAAAGAUCUUUAUUUUCAUUAGAUCUGUGUGUUGGUUUUUUGUGUG
(SEQ. ID. NO. 13).
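
The AAUAAA element can be located in a candidate signal by a simple search. The
Python sketch below is illustrative only; the function names are assumptions, and
the way the 11-30 nucleotide spacing is measured (from the end of the hexamer to
the addition site) is likewise an assumption, since the text does not specify it.

```python
# Synthetic Polyadenylation Signal (SEQ. ID. NO. 13) as given in the text.
SPA = "AAUAAAAGAUCUUUAUUUUCAUUAGAUCUGUGUGUUGGUUUUUUGUGUG"


def find_polya_hexamers(rna, hexamer="AAUAAA"):
    """Zero-based start positions of every occurrence of the hexamer."""
    positions = []
    pos = rna.find(hexamer)
    while pos != -1:
        positions.append(pos)
        pos = rna.find(hexamer, pos + 1)
    return positions


def spacing_in_range(hexamer_start, addition_site, low=11, high=30, hexamer_len=6):
    """True if the addition site lies 11-30 nt past the end of the hexamer."""
    distance = addition_site - (hexamer_start + hexamer_len)
    return low <= distance <= high


if __name__ == "__main__":
    print(len(SPA), find_polya_hexamers(SPA))
```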
Examples of additional regulatory elements useful for enhancing or
regulating gene expression or polypeptide processing that may be present include an
enhancer, a leader sequence and an operator. An enhancer region increases
transcription. Examples of enhancer regions include the CMV enhancer and the
SV40 enhancer. (Hitt et al., Methods in Molecular Genetics 7:13-30, 1995, Xu et al.,
Gene 272:149-156, 2001.) An enhancer region can be associated with a promoter.

A leader sequence is an amino acid region on a polypeptide that directs
the polypeptide into the proteasome. Nucleic acid encoding the leader sequence is 5'
of a structural gene and is transcribed along with the structural gene. An example of a
leader sequence is tPA.

An operator sequence can be used to regulate gene expression. For
example, the Tet operator sequence can be used to repress gene expression.

II. THERAPEUTIC VECTORS
Nucleic acid encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide can be introduced into a patient using vectors suitable for therapeutic
administration. Suitable vectors can deliver nucleic acid into a target cell without
causing an unacceptable side effect.

Cellular expression is achieved using a gene expression cassette
encoding a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide. The gene expression
cassette contains regulatory elements for producing and processing a sufficient
amount of nucleic acid inside a target cell to achieve a beneficial effect.

Examples of vectors that can be used for therapeutic applications
include first and second generation adenovectors, helper dependent adenovectors,
adeno-associated viral vectors, retroviral vectors, alpha virus vectors, Venezuelan
Equine Encephalitis virus vector, and plasmid vectors. (Hitt et al., Advances in
Pharmacology 40:137-206, 1997, Johnston et al., U.S. Patent No. 6,156,588, and
Johnston et al., International Publication Number WO 95/32733.) Preferred vectors
for introducing a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide into a subject are
first generation adenoviral vectors and plasmid DNA vectors.

A. First Generation Adenovectors

First generation adenovectors for expressing a gene expression cassette
contain the expression cassette in an E1 and optionally E3 deleted recombinant
adenovirus genome. The deletion in the E1 region is sufficiently large to remove
elements needed for adenoviral replication.

First generation adenovectors for expressing a Met-NS3-NS4A-NS4B-NS5A-NS5B
polypeptide contain an E1 and E3 deleted recombinant adenovirus
genome. The deletion in the E1 region is sufficiently large to remove elements
needed for adenoviral replication. The combinations of deletions of the E1 and E3
regions are sufficiently large to accommodate a gene expression cassette encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide.

The adenovirus has a double-stranded linear genome with inverted
terminal repeats at both ends. During viral replication, the genome is packaged inside
a viral capsid to form a virion. The virus enters its target cell through viral attachment
followed by internalization. (Hitt et al., Advances in Pharmacology 40:137-206,
1997.)

Adenovectors can be based on different adenovirus serotypes such as
those found in humans or animals. Examples of animal adenoviruses include bovine,
porcine, chimp, murine, canine, and avian (CELO). Preferred adenovectors are based
on human serotypes, more preferably Group B, C, or D serotypes. Examples of
human adenovirus Group B, C, D, or E serotypes include types 2 ("Ad2"), 4 ("Ad4"),
5 ("Ad5"), 6 ("Ad6"), 24 ("Ad24"), 26 ("Ad26"), 34 ("Ad34") and 35 ("Ad35").
Adenovectors can contain regions from a single adenovirus or from two or more
adenoviruses.

In different embodiments adenovectors are based on Ad5, Ad6, or a
combination thereof. Ad5 is described by Chroboczek et al., J. Virology 186:280-285,
1992. Ad6 is described in Figures 7A-7N. An Ad6 based vector containing
Ad5 regions is described in the Example section provided below.

Adenovectors do not need to have their E1 and E3 regions completely
removed. Rather, a sufficient amount of the E1 region is removed to render the vector
replication incompetent in the absence of the E1 proteins being supplied in trans; and
the E1 deletion or the combination of the E1 and E3 deletions is sufficiently
large to accommodate a gene expression cassette.

E1 deletions can be obtained starting at about base pair 342 going up to
about base pair 3523 of Ad5, or a corresponding region from other adenoviruses.
Preferably, the deleted region involves removing a region from about base pair 450 to
about base pair 3511 of Ad5, or a corresponding region from other adenoviruses.
Larger E1 region deletions starting at about base pair 341 remove elements that
facilitate virus packaging.

E3 deletions can be obtained starting at about base pair 27865 to about
base pair 30995 of Ad5, or the corresponding region of other adenovectors.
Preferably the deletion region involves removing a region from about base pair 28134
up to about base pair 30817 of Ad5, or the corresponding region of other
adenovectors.

The combination of deletions to the E1 region and optionally the E3
region should be sufficiently large so that the overall size of the recombinant genome
containing the gene expression cassette does not exceed about 105% of the wild type
adenovirus genome. For example, as recombinant adenovirus Ad5 genomes increase
in size above about 105% the genome becomes unstable. (Bett et al., Journal of
Virology 67:5911-5921, 1993.)

Preferably, the size of the recombinant adenovirus genome containing
the gene expression cassette is about 85% to about 105% the size of the wild type
adenovirus genome. In different embodiments, the size of the recombinant
adenovirus genome containing the expression cassette is about 100% to about
105.2%, or about 100%, the size of the wild type genome.

Approximately 7,500 base pairs can be inserted into an adenovirus genome
with an E1 and E3 deletion. Without any deletion, the Ad5 genome is 35,935 base
pairs and the Ad6 genome is 35,759 base pairs.
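
The size constraint can be checked with simple arithmetic. The Python sketch
below is illustrative only; the function names are assumptions, and the Ad5
deletion boundaries (base pairs 451-3510 for E1 and 28134-30817 for E3) are
assumed from the retained and deleted regions recited in this section. With those
boundaries and a 105% ceiling, the calculation yields an insert capacity of
roughly 7,500 base pairs for Ad5.

```python
AD5_WILD_TYPE_BP = 35935  # Ad5 genome size stated in the text
AD6_WILD_TYPE_BP = 35759  # Ad6 genome size stated in the text

# Assumed inclusive Ad5 deletion boundaries (E1, then E3).
AD5_E1_E3_DELETIONS = [(451, 3510), (28134, 30817)]


def deleted_bp(deletions):
    """Total base pairs removed; each deletion is an inclusive (first, last) pair."""
    return sum(last - first + 1 for first, last in deletions)


def recombinant_size_bp(wild_type_bp, deletions, cassette_bp):
    """Genome size after removing the deletions and adding the cassette."""
    return wild_type_bp - deleted_bp(deletions) + cassette_bp


def percent_of_wild_type(wild_type_bp, deletions, cassette_bp):
    """Recombinant genome size as a percentage of the wild type genome."""
    return 100.0 * recombinant_size_bp(wild_type_bp, deletions, cassette_bp) / wild_type_bp


def max_cassette_bp(wild_type_bp, deletions, limit_percent=105):
    """Largest cassette that keeps the genome at or below the size limit."""
    ceiling = wild_type_bp * limit_percent // 100
    return ceiling - (wild_type_bp - deleted_bp(deletions))


if __name__ == "__main__":
    print(max_cassette_bp(AD5_WILD_TYPE_BP, AD5_E1_E3_DELETIONS))
```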

Replication of first generation adenovectors can be performed by
supplying the E1 gene products in trans. The E1 gene product can be supplied in
trans, for example, by using cell lines that have been transformed with the adenovirus
E1 region. Examples of cells and cell lines transformed with the adenovirus E1
region are HEK 293 cells, 911 cells, PER.C6™ cells, and transfected primary human
amniocyte cells. (Graham et al., Journal of Virology 36:59-72, 1977, Schiedner et
al., Human Gene Therapy 11:2105-2116, 2000, Fallaux et al., Human Gene Therapy
9:1909-1917, 1998, Bout et al., U.S. Patent No. 6,033,908.)

A Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette should be
inserted into a recombinant adenovirus genome in the region corresponding to the
deleted E1 region or the deleted E3 region. The expression cassette can have a
parallel or anti-parallel orientation. In a parallel orientation the transcription direction
of the inserted gene is the same direction as the deleted E1 or E3 gene. In an
anti-parallel orientation the opposite strand serves as a template and the
transcription direction is in the opposite direction.

In an embodiment of the present invention the adenovector has a gene
expression cassette inserted in the E1 deleted region. The vector contains:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In another embodiment of the present invention the adenovector has an
expression cassette inserted in the E3 deleted region. The vector contains:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) a gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region.

In preferred different embodiments concerning adenovirus regions that
are present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2)
the first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first
region corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

B. DNA Plasmid Vectors
DNA vaccine plasmid vectors contain a gene expression cassette along
with elements facilitating replication and preferably vector selection. Preferred
elements provide for replication in non-mammalian cells and a selectable marker.
The vectors should not contain elements providing for replication in human cells or
for integration into human nucleic acid.

The selectable marker facilitates selection of nucleic acids containing
the marker. Preferred selectable markers are those that confer antibiotic resistance.
Examples of antibiotic selection genes include nucleic acid encoding resistance to
ampicillin, neomycin, and kanamycin.

Suitable DNA vaccine vectors can be produced starting with a plasmid
containing a bacterial origin of replication and a selectable marker. Examples of
bacterial origins of replication providing for higher yields include the ColE1
plasmid-derived bacterial origin of replication. (Donnelly et al., Annu. Rev. Immunol.
15:617-648, 1997.)

The presence of the bacterial origin of replication and selectable
marker allows for the production of the DNA vector in a bacterial strain such as E.
coli. The selectable marker is used to eliminate bacteria not containing the DNA
vector.

III. AD6 RECOMBINANT NUCLEIC ACID
Ad6 recombinant nucleic acid comprises an Ad6 region substantially
similar to an Ad6 region found in SEQ. ID. NO. 8, and a region not present in Ad6
nucleic acid. Recombinant nucleic acid comprising Ad6 regions have different uses
such as in producing different Ad6 regions, as intermediates in the production of Ad6
based vectors, and as a vector for delivering a recombinant gene.

As depicted in Figure 9, the genomic organization of Ad6 is very 
similar to the genomic organization of Ad5. The homology between Ad5 and Ad6 is 
approximately 98%. 

In different embodiments, the Ad6 recombinant nucleic acid comprises
a nucleotide region substantially similar to E1A, E1B, E2B, E2A, E3, E4, L1, L2, L3,
or L4, or any combination thereof. A substantially similar nucleic acid region to an
Ad6 region has a nucleotide sequence identity of at least 65%, at least 75%, at least
85%, at least 95%, at least 99% or 100%; or a nucleotide difference of 1-2, 1-3, 1-4,
1-5, 1-6, 1-7, 1-8, 1-9, 1-10, 1-11, 1-12, 1-13, 1-14, 1-15, 1-16, 1-17, 1-18, 1-19, 1-20,
1-25, 1-30, 1-35, 1-40, 1-45, or 1-50 nucleotides. Techniques and embodiments for
determining substantially similar nucleic acid sequences are described in Section I.B.
supra.

Preferably, the recombinant Ad6 nucleic acid contains an expression
cassette coding for a polypeptide not found in Ad6. Examples of expression cassettes
include those coding for HCV regions and those coding for other types of
polypeptides.

Different types of adenoviral vectors can be produced incorporating
different amounts of Ad6, such as first and second generation adenovectors. As noted
in Section II.A. supra, first generation adenovectors are defective in E1 and can
replicate when E1 is supplied in trans.

Second generation adenovectors contain less adenoviral genome than
first generation vectors and can be used in conjunction with complementing cell lines
and/or helper vectors supplying adenoviral proteins. Second generation adenovectors
are described in different references such as Russell, Journal of General Virology
81:2573-2604, 2000; Hitt et al., 1997, Human Ad vectors for Gene Transfer,
Advances in Pharmacology, Vol 40 Academic Press.

In an embodiment of the present invention, the Ad6 recombinant
nucleic acid is an adenovirus vector defective in E1 that is able to replicate when E1 is
supplied in trans. Expression cassettes can be inserted into a deleted E1 region and/or
a deleted E3 region.

An example of an Ad6 based adenoviral vector with an expression
cassette provided in a deleted E1 region comprises or consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

e) an optionally present fourth region from about base pair 28134
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to
about base pair 30788 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein the fifth region is joined to the fourth
region if the fourth region is present, or the fifth is joined to the third region if the
fourth region is not present; and

g) a sixth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fifth region;

wherein at least one Ad6 region is present.

In different embodiments of the invention, all of the regions are from
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4
regions selected from the second, third, fourth, and fifth regions are from Ad6.

An example of an Ad6 based adenoviral vector with an expression
cassette provided in a deleted E3 region comprises or consists of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) a gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to the third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region;

wherein at least one Ad6 region is present.

In different embodiments of the invention, all of the regions are from
Ad6; all of the regions except for the first and second are from Ad6; and 1, 2, 3, or 4
regions selected from the second, third, fourth and fifth regions are from Ad6.

IV. VECTOR PRODUCTION

Vectors can be produced using recombinant nucleic acid techniques
such as those involving the use of restriction enzymes, nucleic acid ligation, and
homologous recombination. Recombinant nucleic acid techniques are well known in
the art. (Ausubel, Current Protocols in Molecular Biology, John Wiley, 1987-1998,
and Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd Edition, Cold
Spring Harbor Laboratory Press, 1989.)

Intermediate vectors are used to derive a therapeutic vector or to
transfer an expression cassette or portion thereof from one vector to another vector.
Examples of intermediate vectors include adenovirus genome plasmids and shuttle
vectors.

Useful elements in an intermediate vector include an origin of 
replication, a selectable marker, homologous recombination regions, and convenient 
restriction sites. Convenient restriction sites can be used to facilitate cloning or release 
of a nucleic acid sequence. 

Homologous recombination regions provide nucleic acid sequence
regions that are homologous to a target region in another nucleic acid molecule. The
homologous regions flank the nucleic acid sequence that is being inserted into the
target region. In different embodiments homologous regions are preferably about 150
to 600 nucleotides in length, or about 100 to 500 nucleotides in length.

An embodiment of the present invention describes a shuttle vector
containing a Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette, a selectable
marker, a bacterial origin of replication, a first adenovirus homology region and a
second adenovirus homology region that target the expression cassette to insert in
or replace an E1 region. The first and second homology regions flank the expression
cassette. The first homology region contains at least about 100 base pairs
substantially homologous to at least the right end (3' end) of a wild-type adenovirus
region from about base pairs 4-450. The second homology region contains at least
about 100 base pairs substantially homologous to at least the left end (5' end) of Ad5
from about base pairs 3511-5792, or the corresponding region from another adenovirus.

Reference to "substantially homologous" indicates a sufficient degree
of homology to specifically recombine with a target region. In different embodiments
substantially homologous refers to at least 85%, at least 95%, or 100% sequence
identity. Sequence identity can be calculated as described in Section I.B. supra.

One method of producing adenovectors is through the creation of an
adenovirus genome plasmid containing an expression cassette. The pre-Adenovirus
plasmid contains all the adenovirus sequences needed for replication in the desired
complementing cell line. The pre-Adenovirus plasmid is then digested with a
restriction enzyme to release the viral ITRs and transfected into the complementing
cell line for virus rescue. The ITRs must be released from plasmid sequences to
allow replication to occur. Adenovector rescue results in the production of an
adenovector containing the expression cassette.

A. Adenovirus Genome Plasmids

Adenovirus genome plasmids contain an adenovector sequence inside
a longer-length plasmid (which may be a cosmid). The longer-length plasmid may
contain additional elements such as those facilitating growth and selection in
eukaryotic or bacterial cells depending upon the procedures employed to produce and
maintain the plasmid. Techniques for producing adenovirus genome plasmids include
those involving the use of shuttle vectors and homologous recombination, and those
involving the insertion of a gene expression cassette into an adenovirus cosmid. (Hitt
et al., Methods in Molecular Genetics 7:13-30, 1995, Danthinne et al., Gene Therapy
7:1707-1714, 2000.)

Adenovirus genome plasmids preferably have a gene expression
cassette inserted into an E1 or E3 deleted region. In an embodiment of the present
invention, the adenovirus genome plasmid contains a gene expression cassette
inserted in the E1 deleted region, an origin of replication, a selectable marker, and the
recombinant adenovirus region is made up of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to the first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

d) a third adenovirus region from about base pair 5549 to about 
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair 
28156 corresponding to Ad6, joined to the second region; 

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the third region;

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region, and 

g) an optionally present E3 region corresponding to all or part of
the E3 region present in Ad5 or Ad6, which may be present for smaller inserts taking
into account the overall size of the desired adenovector.

In another embodiment of the present invention the recombinant
adenovirus genome plasmid has the gene expression cassette inserted in the E3
deleted region. The vector contains an origin of replication, a selectable marker, and
the following:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to the expression cassette;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to the second region;

d) the gene expression cassette in a E3 parallel or E3 anti-parallel 
orientation joined to the third region; 

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to the gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to the fourth region. 

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2) the
first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.
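
The region coordinates recited above can be collected into a small lookup table. The following is a minimal sketch only; the dictionary layout and the helper name `backbone` are illustrative assumptions and are not part of the described vectors. The coordinates are the approximate ("about") base pair positions given in the text.

```python
# Approximate base-pair coordinates of the five adenovirus backbone regions,
# copied from the text. The data layout is an illustrative assumption.
REGIONS = {
    1: {"Ad5": (1, 450), "Ad6": (1, 450)},
    2: {"Ad5": (3511, 5548), "Ad6": (3508, 5541)},
    3: {"Ad5": (5549, 28133), "Ad6": (5542, 28156)},
    4: {"Ad5": (30818, 33966), "Ad6": (30789, 33784)},
    5: {"Ad5": (33967, 35935), "Ad6": (33785, 35759)},
}


def backbone(serotypes):
    """Return (region, serotype, start, end) tuples for a five-region choice.

    `serotypes` lists the serotype ("Ad5" or "Ad6") chosen for regions 1 to 5.
    """
    if len(serotypes) != 5:
        raise ValueError("exactly five regions must be specified")
    layout = []
    for region, serotype in enumerate(serotypes, start=1):
        if serotype not in ("Ad5", "Ad6"):
            raise ValueError(f"unknown serotype: {serotype}")
        start, end = REGIONS[region][serotype]
        layout.append((region, serotype, start, end))
    return layout


# Embodiment (3) in the text: Ad5, Ad5, Ad6, Ad6, Ad5.
CHIMERIC = backbone(("Ad5", "Ad5", "Ad6", "Ad6", "Ad5"))
```

A table of this kind makes it easy to see which serotype supplies each region of a chimeric backbone; the expression cassette position (E1 or E3) is not modeled here.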

An embodiment of the present invention describes a method of making
an adenovector involving a homologous recombination step to produce an adenovirus
genome plasmid and an adenovirus rescue step. The homologous recombination step
involves the use of a shuttle vector containing a Met-NS3-NS4A-NS4B-NS5A-NS5B
expression cassette flanked by adenovirus homology regions. The adenovirus
homology regions target the expression cassette into either the E1 or E3 deleted
region.

In an embodiment of the present invention concerning the production
of an adenovirus genome plasmid, the gene expression cassette is inserted into a
vector comprising: a first adenovirus region from about base pair 1 to about base pair
450 corresponding to either Ad5 or Ad6; a second adenovirus region from about base
pair 3511 to about base pair 5548 corresponding to Ad5 or from about base pair 3508
to about base pair 5541 corresponding to Ad6, joined to the first region; a third
adenovirus region from about base pair 5549 to about base pair 28133 corresponding
to Ad5 or from about base pair 5542 to about base pair 28156 corresponding to Ad6,
joined to the second region; a fourth adenovirus region from about base pair 30818 to
about base pair 33966 corresponding to Ad5 or from about base pair 30789 to about
base pair 33784 corresponding to Ad6, joined to the third region; and a fifth
adenovirus region from about base pair 33967 to about base pair 35935 corresponding
to Ad5 or from about base pair 33785 to about base pair 35759 corresponding to Ad6,
joined to the fourth region. The adenovirus genome plasmid should contain an origin
of replication and a selectable marker, and may contain all or part of the Ad5 or Ad6
E3 region.

In different embodiments concerning adenovirus regions that are
present: (1) the first, second, third, fourth, and fifth regions correspond to Ad5; (2) the
first, second, third, fourth, and fifth regions correspond to Ad6; and (3) the first region
corresponds to Ad5, the second region corresponds to Ad5, the third region
corresponds to Ad6, the fourth region corresponds to Ad6, and the fifth region
corresponds to Ad5.

B. Adenovector Rescue

An adenovector can be rescued from a recombinant adenovirus
genome plasmid using techniques known in the art or described herein. Examples of
techniques for adenovirus rescue well known in the art are provided by Hitt et al.,
Methods in Molecular Genetics 7:13-30, 1995, and Danthinne et al., Gene Therapy
7:1707-1714, 2000.

A preferred method of rescuing an adenovector described herein
involves boosting adenoviral replication. Boosting adenoviral replication can be
performed, for example, by supplying adenoviral functions such as E2 proteins
(polymerase, pre-terminal protein and DNA binding protein) as well as E4 orf6 on a
separate plasmid. Example 10, infra, illustrates the boosting of adenoviral replication
to rescue an adenovector containing a codon optimized
Met-NS3-NS4A-NS4B-NS5A-NS5B expression cassette.

V. PARTIAL-OPTIMIZED HCV ENCODING SEQUENCES

Partial optimization of HCV polyprotein encoding nucleic acid
provides fewer codons optimized for expression in a human than
complete optimization. The overall objective is to provide the benefits of increased
expression due to codon optimization, while facilitating the production of an
adenovector containing HCV polyprotein encoding nucleic acid having optimized
codons.

Complete optimization of an HCV polyprotein encoding sequence
provides the most frequently observed human codon for each amino acid. Complete
optimization can be performed using codon frequency tables well known in the art
and using programs such as the BACKTRANSLATE program (Wisconsin Package
version 10, Genetics Computer Group, GCG, Madison, Wis.).
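
Complete optimization as described above amounts to a table lookup: each amino acid is replaced by its single most frequent codon. The following minimal sketch illustrates the idea. The codon table shown is an illustrative assumption, not the GCG table, and the function `back_translate` is a hypothetical helper rather than the BACKTRANSLATE program itself.

```python
# Illustrative "most frequent codon" choice for each amino acid (one-letter code).
# These entries are an assumption for the sketch; an actual run would load a
# published codon frequency table for highly expressed human genes.
MOST_FREQUENT_CODON = {
    "A": "GCC", "R": "CGC", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def back_translate(protein: str) -> str:
    """Back-translate a protein sequence using one fixed codon per amino acid."""
    try:
        return "".join(MOST_FREQUENT_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]}") from None


# Example: a short peptide containing the Gly-Asp-Asp motif.
EXAMPLE_DNA = back_translate("MGDD")
```

Because every occurrence of an amino acid receives the same codon, a completely optimized sequence tends to be GC-rich and repetitive, which is one motivation for the partial optimization discussed in this section.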

Partial optimization can be performed on an entire HCV polyprotein
encoding sequence that is present (e.g., NS3-NS5B), or one or more local regions that are
present. In different embodiments the GC content for the entire HCV encoded polyprotein
that is present is no greater than about 65%; and the GC content for one or more
local regions is no greater than about 70%.

Local regions are regions present in HCV encoding nucleic acid, and can 
vary in size. For example, local regions can be about 60, about 70, about 80, about 90 or 
about 100 nucleotides in length. 
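
The GC content limits for the entire sequence and for local regions can be checked with a sliding window. The sketch below is illustrative only: the function names are hypothetical, and the default thresholds (65% overall, 70% local) and window size (100 nucleotides) are taken from the embodiments described above.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_local_gc(seq: str, window: int = 100) -> float:
    """Highest GC fraction found in any window of the given length."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(seq) <= window:
        return gc_fraction(seq)
    return max(
        gc_fraction(seq[i:i + window]) for i in range(len(seq) - window + 1)
    )


def within_gc_limits(seq: str, overall_limit: float = 0.65,
                     local_limit: float = 0.70, window: int = 100) -> bool:
    """True if both the overall and the local GC limits are respected."""
    return (gc_fraction(seq) <= overall_limit
            and max_local_gc(seq, window) <= local_limit)
```

A sequence can pass the overall limit and still fail the local one, which is why the two limits are checked separately.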

Partial optimization can be achieved by initially constructing an HCV
polyprotein encoding sequence to be partially optimized based on a naturally occurring
sequence. Alternatively, an optimized HCV encoding sequence can be used as a basis of
comparison to produce a partially optimized sequence.

VI. HCV COMBINATION TREATMENT

The HCV Met-NS3-NS4A-NS4B-NS5A-NS5B vaccine can be used by
itself to treat a patient, can be used in conjunction with other HCV therapeutics, and
can be used with agents targeting other types of diseases. Additional therapeutics
include additional therapeutic agents to treat HCV and diseases having a high
prevalence in HCV infected persons. Agents targeting other types of disease include
vaccines directed against HIV and HBV.

Additional therapeutics for treating HCV include vaccines and
non-vaccine agents. (Zein, Expert Opin. Investig. Drugs 10:1457-1469, 2001.) Examples
of additional HCV vaccines include vaccines designed to elicit an immune response
against an HCV core antigen and the HCV E1, E2 or p7 region. Vaccine components
can be naturally occurring HCV polypeptides, HCV mimotope polypeptides or nucleic
acid encoding such polypeptides.

HCV mimotope polypeptides contain HCV epitopes, but have a
different sequence than a naturally occurring HCV antigen. An HCV mimotope can be
fused to a naturally occurring HCV antigen. References describing techniques for
producing mimotopes in general and describing different HCV mimotopes are
provided in Felici et al., U.S. Patent No. 5,994,083 and Nicosia et al., International
Application Number WO 99/60132.

VII. PHARMACEUTICAL ADMINISTRATION

HCV vaccines can be formulated and administered to a patient using
the guidance provided herein along with techniques well known in the art. Guidelines
for pharmaceutical administration in general are provided in, for example, Modern
Vaccinology, Ed. Kurstak, Plenum Med. Co. 1994; Remington's Pharmaceutical
Sciences 18th Edition, Ed. Gennaro, Mack Publishing, 1990; and Modern
Pharmaceutics 2nd Edition, Eds. Banker and Rhodes, Marcel Dekker, Inc., 1990, each
of which is hereby incorporated by reference herein.

HCV vaccines can be administered by different routes such as
intravenous, intraperitoneal, subcutaneous, intramuscular, intradermal, impression
through the skin, or nasal. A preferred route is intramuscular.

Intramuscular administration can be performed using different
techniques such as by injection with or without one or more electric pulses. Electrically
mediated transfer can assist genetic immunization by stimulating both humoral and
cellular immune responses.

Vaccine injection can be performed using different techniques, such as
by employing a needle or a needleless injection system. An example of a needleless
injection system is a jet injection device. (Donnelly et al., International Publication
Number WO 99/52463.)

A. Electrically Mediated Transfer 

Electrically mediated transfer or Gene Electro-Transfer (GET) can be
performed by delivering suitable electric pulses after nucleic acid injection. (See
Mathiesen, International Publication Number WO 98/43702.) Plasmid injection and
electroporation can be performed using stainless steel needles. Needles can be used in
couples, triplets or more complex patterns. In one configuration the needles are
soldered on a printed circuit board that is a mechanical support and connects the
needles to the electrical field generator by means of suitable cables.

The electrical stimulus is given in the form of electrical pulses. Pulses
can be of different forms (square, sinusoidal, triangular, exponential decay) and
different polarity (monopolar of positive or negative polarity, bipolar). Pulses can be
delivered either at constant voltage or constant current modality.

Different patterns of electric treatment can be used to introduce nucleic
acid vaccines including HCV and other nucleic acid vaccines into a patient. Possible
patterns of electric treatment include the following:

Treatment 1: 10 trains of 1000 square bipolar pulses delivered every
other second, pulse length 0.2 msec/phase, frequency 1000 Hz, constant voltage
mode, 45 Volts/phase, floating current.

Treatment 2: 2 trains of 100 square bipolar pulses delivered every other
second, pulse length 2 msec/phase, frequency 100 Hz, constant current mode, 100
mA/phase, floating voltage.

Treatment 3: 2 trains of bipolar pulses at a pulse length of about 2
msec/phase, for a total length of about 3 seconds, where the actual current going
through the tissue is fixed at about 50 mA.
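
The timing implied by Treatments 1 and 2 can be worked out from the pulse count and frequency. The following sketch is illustrative only; the class name and fields are hypothetical, and the duty cycle calculation assumes that each bipolar pulse consists of two phases of the stated length, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseTrain:
    """Parameters of one electric treatment pattern (illustrative model)."""
    trains: int
    pulses_per_train: int
    frequency_hz: float
    phase_ms: float  # length of one phase of a bipolar pulse

    def train_duration_s(self) -> float:
        """Duration of a single train in seconds."""
        return self.pulses_per_train / self.frequency_hz

    def duty_cycle(self) -> float:
        """Fraction of each period occupied by the two phases (assumption)."""
        period_ms = 1000.0 / self.frequency_hz
        return 2 * self.phase_ms / period_ms


# Values taken from Treatment 1 and Treatment 2 above.
TREATMENT_1 = PulseTrain(trains=10, pulses_per_train=1000,
                         frequency_hz=1000.0, phase_ms=0.2)
TREATMENT_2 = PulseTrain(trains=2, pulses_per_train=100,
                         frequency_hz=100.0, phase_ms=2.0)
```

Under these assumptions both treatments use trains lasting about one second, which is consistent with trains being "delivered every other second".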

Electric pulses are delivered through an electric field generator. A
suitable generator can be composed of three independent hardware elements
assembled in a common chassis and driven by a portable PC which runs the driving
program. The software manages both basic and accessory functions. The elements of
the device are: (1) signal generator driven by a microprocessor, (2) power amplifier
and (3) digital oscilloscope.

The signal generator delivers signals having arbitrary frequency and
shape in a given range under software control. The same software has an interactive
editor for the waveform to be delivered. The generator features a digitally controlled
current limiting device (a safety feature to control the maximal current output). The
power amplifier can amplify the signal generated up to +/- 150 V. The oscilloscope is
digital and is able to sample both the voltage and the current being delivered by the
amplifier.

B. Pharmaceutical Carriers 

Pharmaceutically acceptable carriers facilitate storage and
administration of a vaccine to a subject. Examples of pharmaceutically acceptable
carriers are described herein. Additional pharmaceutically acceptable carriers are well
known in the art.

Pharmaceutically acceptable carriers may contain different components
such as a buffer, normal saline or phosphate buffered saline, sucrose, salts and
polysorbate. An example of a pharmaceutically acceptable carrier is as follows: 2.5-10
mM TRIS buffer, preferably about 5 mM TRIS buffer; 25-100 mM NaCl, preferably
about 75 mM NaCl; 2.5-10% sucrose, preferably about 5% sucrose; 0.01-2 mM
MgCl2; and 0.001%-0.01% polysorbate 80 (plant derived). The pH is preferably from
about 7.0-9.0, more preferably about 8.0. A specific example of a carrier contains 5
mM TRIS, 75 mM NaCl, 5% sucrose, 1 mM MgCl2, 0.005% polysorbate 80 at pH
8.0.
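
The specific carrier example can be converted into amounts per liter as a worked illustration. The sketch below rests on several assumptions that are not stated in the text: percentages are treated as weight/volume, TRIS is weighed as the free base, and MgCl2 is weighed as the anhydrous salt. The helper name `grams_per_liter` is hypothetical.

```python
# Molar masses in g/mol. Using Tris base and anhydrous MgCl2 is an assumption,
# since the text does not name the salt forms.
MOLAR_MASS = {"TRIS": 121.14, "NaCl": 58.44, "MgCl2": 95.21}


def grams_per_liter(millimolar, percent_wv):
    """Convert mM and % (w/v) amounts into grams per liter of solution."""
    grams = {name: mm / 1000.0 * MOLAR_MASS[name]
             for name, mm in millimolar.items()}
    # 1% (w/v) is 1 g per 100 mL, i.e. 10 g per liter (assumed convention).
    grams.update({name: pct * 10.0 for name, pct in percent_wv.items()})
    return grams


# The specific carrier example from the text.
CARRIER = grams_per_liter(
    millimolar={"TRIS": 5.0, "NaCl": 75.0, "MgCl2": 1.0},
    percent_wv={"sucrose": 5.0, "polysorbate 80": 0.005},
)
```

The pH adjustment to about 8.0 is not modeled; in practice the amount of acid or base needed depends on the form of TRIS used.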

C. Dosing Regimes 

Suitable dosing regimens can be determined taking into account the
efficacy of a particular vaccine and factors such as age, weight, sex and medical
condition of a patient; the route of administration; the desired effect; and the number
of doses. The efficacy of a particular vaccine depends on different factors such as the
ability of a particular vaccine to produce polypeptide that is expressed and processed
in a cell and presented in the context of MHC class I and II complexes.

HCV encoding nucleic acid administered to a patient can be part of
different types of vectors including viral vectors such as adenovector, and DNA
plasmid vaccines. In different embodiments concerning administration of a DNA
plasmid, about 0.1 to 10 mg of plasmid is administered to a patient, and about 1 to 5
mg of plasmid is administered to a patient. In different embodiments concerning
administration of a viral vector, preferably an adenoviral vector, about 10^5 to 10^11
viral particles are administered to a patient, and about 10^7 to 10^10 viral particles are
administered to a patient.

Viral vector vaccines and DNA plasmid vaccines may be administered
alone, or may be part of a prime and boost administration regimen. A mixed modality
priming and booster inoculation involves either priming with a DNA vaccine and
boosting with viral vector vaccine, or priming with a viral vector vaccine and boosting
with a DNA vaccine.

Multiple primings, for example, about 2-4 or more, may be used. The
length of time between priming and boost may typically vary from about four months
to a year, but other time frames may be used. The use of a priming regimen with a
DNA vaccine may be preferred in situations where a person has a pre-existing
anti-adenovirus immune response.

In an embodiment of the present invention, 1x10^7 to 1x10^12 particles
and preferably about 1x10^10 to 1x10^11 particles of adenovector are administered directly
into muscle tissue. Following initial vaccination a boost is performed with an
adenovector or DNA vaccine.

In another embodiment of the present invention initial vaccination is
performed with a DNA vaccine directly into muscle tissue. Following initial
vaccination a boost is performed with an adenovector or DNA vaccine.

Agents such as interleukin-12, GM-CSF, B7-1, B7-2, IP10, Mig-1 can
be coadministered to boost the immune response. The agents can be coadministered
as proteins or through use of nucleic acid vectors.

D. Heterologous Prime-Boost 

Heterologous prime-boost is a mixed modality involving the use of one
type of viral vector for priming and another type of viral vector for boosting. The
heterologous prime-boost can involve related vectors such as vectors based on
different adenovirus serotypes and more distantly related viruses such as adenovirus and
poxvirus. The use of poxvirus and adenovirus vectors to protect mice against malaria
is illustrated by Gilbert et al., Vaccine 20:1039-1045, 2002.

Different embodiments concerning priming and boosting involve the
following types of vectors expressing desired antigens such as
Met-NS3-NS4A-NS4B-NS5A-NS5B: Ad5 vector followed by Ad6 vector; Ad6 vector followed by
Ad5 vector; Ad5 vector followed by poxvirus vector; poxvirus vector followed by
Ad5 vector; Ad6 vector followed by poxvirus vector; and poxvirus vector followed by
Ad6 vector.
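
The six prime-boost combinations listed above are the ordered pairs of two different vector types drawn from Ad5, Ad6 and poxvirus. A minimal sketch of that enumeration follows; the function name is a hypothetical helper used only for illustration.

```python
from itertools import permutations

# Vector types named in the text.
VECTORS = ("Ad5", "Ad6", "poxvirus")


def prime_boost_pairs(vectors=VECTORS):
    """Return every (prime, boost) pair that uses two different vector types."""
    return list(permutations(vectors, 2))


PAIRS = prime_boost_pairs()
```

Homologous regimens, in which the same vector type is used for priming and boosting, are excluded by construction.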

The length of time between priming and boosting typically varies from
about four months to a year, but other time frames may be used. The minimum time
frame should be sufficient to allow for an immunological rest. In an embodiment, this
rest is for a period of at least 6 months. Priming may involve multiple primings with
one type of vector, such as 2-4 primings.

Expression cassettes present in a poxvirus vector should contain a
promoter either native to, or derived from, the poxvirus of interest or another poxvirus
member. Different strategies for constructing and employing different types of
poxvirus based vectors including those based on vaccinia virus, modified vaccinia
virus, avipoxvirus, raccoon poxvirus, modified vaccinia virus Ankara,
canarypoxviruses (such as ALVAC), fowlpox viruses, cowpoxviruses, and NYVAC
are well known in the art. (Moss, Current Topics in Microbiology and Immunology
158:25-38, 1982; Earl et al., In Current Protocols in Molecular Biology, Ausubel et
al. eds., New York: Greene Publishing Associates & Wiley Interscience;
1991:16.16.1-16.16.7; Child et al., Virology 174(2):625-9, 1990; Tartaglia et al.,
Virology 188:217-232, 1992; U.S. Patent Nos. 4,603,112, 4,722,848, 4,769,330,
5,110,587, 5,174,993, 5,185,146, 5,266,313, 5,505,941, 5,863,542, and 5,942,235.)

E. Adjuvants 

HCV vaccines can be formulated with an adjuvant. Adjuvants are
particularly useful for DNA plasmid vaccines. Examples of adjuvants are alum,
AlPO4, alhydrogel, Lipid-A and derivatives or variants thereof, Freund's incomplete
adjuvant, neutral liposomes, liposomes containing the vaccine and cytokines,
non-ionic block copolymers, and chemokines.

Non-ionic block polymers containing polyoxyethylene (POE) and
polyoxypropylene (POP), such as POE-POP-POE block copolymers, may be used as an
adjuvant. (Newman et al., Critical Reviews in Therapeutic Drug Carrier Systems
15:89-142, 1998.) The immune response to a nucleic acid can be enhanced using a
non-ionic block copolymer combined with an anionic surfactant.

A specific example of an adjuvant formulation is one containing
CRL-1005 (CytRx Research Laboratories), DNA, and benzalkonium chloride (BAK).
The formulation can be prepared by adding pure polymer to a cold (< 5°C) solution of
plasmid DNA in PBS using a positive displacement pipette. The solution is then
vortexed to solubilize the polymer. After complete solubilization of the polymer a
clear solution is obtained at temperatures below the cloud point of the polymer
(~6-7°C). Approximately 4 mM BAK is then added to the DNA/CRL-1005 solution in
PBS, by slow addition of a dilute solution of BAK dissolved in PBS. The initial DNA
concentration is approximately 6 mg/mL before the addition of polymer and BAK,
and the final DNA concentration is about 5 mg/mL. After BAK addition the
formulation is vortexed extensively, while the temperature is allowed to increase from
~2°C to above the cloud point. The formulation is then placed on ice to decrease the
temperature below the cloud point. Then, the formulation is vortexed while the
temperature is allowed to increase from ~2°C to above the cloud point. Cooling and
mixing while the temperature is allowed to increase from ~2°C to above the cloud
point is repeated several times, until the particle size of the formulation is about
200-500 nm, as measured by dynamic light scattering. The formulation is then stored on
ice until the solution is clear, then placed in storage at -70°C. Before use, the
formulation is allowed to thaw at room temperature.

F. Vaccine Storage

Adenovector and DNA vaccines can be stored using different types of
buffers. For example, buffer A105 described in Example 9, infra, can be used for
vector storage.

Storage of DNA can be enhanced by removal or chelation of trace
metal ions. Reagents such as succinic or malic acid, and chelators can be used to
enhance DNA vaccine stability. Examples of chelators include multiple phosphate
ligands and EDTA. The inclusion of non-reducing free radical scavengers, such as
ethanol or glycerol, can also be useful to prevent damage of DNA plasmid from free
radical production. Furthermore, the buffer type, pH, salt concentration, light
exposure, as well as the type of sterilization process used to prepare the vials, may be
controlled in the formulation to optimize the stability of the DNA vaccine.

VIII. EXAMPLES

Examples are provided below to further illustrate different features of
the present invention. The examples also illustrate useful methodology for practicing
the invention. These examples do not limit the claimed invention.

Example 1: Met-NS3-NS4A-NS4B-NS5A-NS5B Expression Cassettes 

Different gene expression cassettes encoding HCV
NS3-NS4A-NS4B-NS5A-NS5B were constructed based on a 1b subtype HCV BK strain. The encoded
sequences had either (1) an active NS5B sequence ("NS"), (2) an inactive NS5B
sequence ("NSmut"), or (3) a codon optimized sequence with an inactive NS5B
sequence ("NSOPTmut"). The expression cassettes also contained a CMV
promoter/enhancer and the BGH polyadenylation signal.

The NS nucleotide sequence (SEQ. ID. NO. 5) differs from HCV BK
strain GenBank accession number M58335 by 30 out of 5952 nucleotides. The NS
amino acid sequence (SEQ. ID. NO. 6) differs from the corresponding 1b genotype
HCV BK strain by 7 out of 1984 amino acids. To allow for initiation of translation an
ATG codon is present at the 5' end of the NS sequence. A TGA termination sequence
is present at the 3' end of the NS sequence.

The NSmut nucleotide sequence (SEQ. ID. NO. 2, Figure 2) is similar
to the NS sequence. The differences between NSmut and NS include NSmut having
an altered NS5B catalytic site; an optimal ribosome binding site at the 5' end; and a
TAAA termination sequence at the 3' end. The alterations in NS5B comprise bases
5138 to 5146, which encode amino acids 1711 to 1713. The alterations result in a
change of amino acids GlyAspAsp into AlaAlaGly and create an inactive form of the
NS5B RNA-dependent RNA polymerase.
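
The relationship between the altered bases and the altered amino acids can be checked with simple codon arithmetic. The sketch below is illustrative only. It assumes that the initiating Met is amino acid 1 and that the coding region is uninterrupted; the coding start position of base 8 is not stated in the text but is inferred from the quoted coordinates, consistent with a short ribosome binding site preceding the ATG.

```python
def codon_bases(aa_index: int, cds_start: int) -> tuple[int, int]:
    """Return the first and last base (1-based) of the codon for one amino acid."""
    if aa_index < 1 or cds_start < 1:
        raise ValueError("positions are 1-based")
    first = cds_start + 3 * (aa_index - 1)
    return first, first + 2


# Inferred assumption: the ATG of NSmut begins at base 8 of SEQ. ID. NO. 2.
CDS_START = 8

FIRST_ALTERED = codon_bases(1711, CDS_START)
LAST_ALTERED = codon_bases(1713, CDS_START)
```

Under this assumption, amino acids 1711 to 1713 map onto bases 5138 to 5146, in agreement with the coordinates given above.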

The NSOPTmut sequence (SEQ. ID. NO. 3, Figure 3) was designed
based on the amino acid sequence encoded by NSmut. The NSmut amino acid
sequence was back translated into a nucleotide sequence with the GCG (Wisconsin
Package version 10, Genetics Computer Group, GCG, Madison, Wis.)
BACKTRANSLATE program. To generate a NSOPTmut nucleotide sequence where
each amino acid is coded for by the corresponding most frequently observed human
codon, the program was run choosing as parameter the generation of the most
probable nucleotide sequence and specifying the codon frequency table of highly
expressed human genes (human_high.cod) available within the GCG Package as
translation scheme.

Example 2: Generation of pV1Jns Plasmids with NS, NSmut or NSOPTmut Sequences

pV1Jns plasmids containing either the NS sequence, NSmut sequence
or NSOPTmut sequence were generated and characterized as follows:

pV1Jns Plasmid with the NS Sequence

The coding region Met-NS3-NS4A-NS4B-NS5A and the coding
region Met-NS3-NS4A-NS4B-NS5A-NS5B from a HCV BK type strain (Tomei et
al., J. Virol. 67:4017-4026, 1993) were cloned into pcDNA3 plasmid (Invitrogen),
generating pcD3-5a and pcD3-5b vectors, respectively. pcD3-5a was digested with
Hind III, blunt-ended with Klenow fill-in and subsequently digested with Xba I, to
generate a fragment corresponding to the coding region of
Met-NS3-NS4A-NS4B-NS5A. The fragment was cloned into pV1Jns-poly, digested with Bgl II, blunt-ended
with Klenow fill-in and subsequently digested with Xba I, generating pV1JnsNS3-5A.

pV1Jns-poly is a derivative of pV1JnsA plasmid (Montgomery et al.,
DNA and Cell Biol. 12:777-783, 1993), modified by insertion of a polylinker
containing recognition sites for XbaI, PmeI, PacI into the unique BglII and NotI
restriction sites. The pV1Jns plasmid with the NS sequence (pV1JnsNS3-5B) was
obtained by homologous recombination into the bacterial strain BJ5183,
co-transforming pV1JNS3-5A linearized with XbaI and NotI digestion and a PCR
fragment containing approximately 200 bp of NS5A, NS5B coding sequence and
approximately 60 bp of the BGH polyadenylation signal. The resulting plasmid
represents pV1Jns-NS.

pV1Jns-NS can be summarized as follows:
Bases 1 to 1881 of pV1JnsA
an additional AGCTT
then the Met-NS3-NS5B sequence (SEQ. ID. NO. 5)
then the wt TGA stop
an additional TCTAGAGCGTTTAAACCCTTAATTAAGG (SEQ. ID. NO. 14)
Bases 1912 to 4909 of pV1JnsA

pV1Jns Plasmid with the NSmut Sequence

The V1JnsNS3-5A plasmid was modified at the 5' end of the NS3 coding
sequence by addition of a full Kozak sequence. The plasmid (V1JNS3-5Akozak) was
obtained by homologous recombination into the bacterial strain BJ5183,
co-transforming V1JNS3-5A linearized by AflII digestion and a PCR fragment containing
the proximal part of Intron A, the restriction site BglII, a full Kozak translation
initiation sequence and part of the NS3 coding sequence.

The resulting plasmid (V1JNS3-5Akozak) was linearized with Xba I
digestion and co-transformed into the bacterial strain BJ5183 with a PCR fragment,
containing approximately 200 bp of NS5A, the NS5B mutated sequence, the strong
translation termination TAAA and approximately 60 bp of the BGH polyadenylation
signal. The PCR fragment was obtained by assembling two 22 bp-overlapping
fragments where mutations were introduced by the oligonucleotides used for their
amplification. The resulting plasmid represents pV1Jns-NSmut.

pV1Jns-NSmut can be summarized as follows:
Bases 1 to 1882 of pV1JnsA
then the kozak Met-NS3-NS5B(mut) TAAA sequence (SEQ. ID. NO. 2)
an additional TCTAGA
Bases 1925 to 4909 of pV1JnsA



pV1Jns Plasmid with the NSOPTmut Sequence

The human codon-optimized synthetic gene (NSOPTmut), with a
mutated NS5B to abrogate enzymatic activity, a full Kozak translation initiation
sequence and a strong translation termination, was digested at the BamHI and SalI
restriction sites present at the 5' and 3' ends of the gene. The gene was then cloned
into the BglII and SalI restriction sites present in the polylinker of the pV1JnsA plasmid,
generating pV1Jns-NSOPTmut.

pV1Jns-NSOPTmut can be summarized as follows:
Bases 1 to 1881 of pV1JnsA
an additional C
then kozak Met-NS3-NS5B(optmut) TAAA sequence (SEQ. ID. NO. 3)
an additional TTTAAATGTTTAAAC (SEQ. ID. NO. 15)
Bases 1905 to 4909 of pV1JnsA



Plasmid Characterization

Expression of HCV NS proteins was tested by transfection of HEK
293 cells grown in 10% FCS/DMEM supplemented with L-glutamine (final 4 mM).
Twenty-four hours before transfection, cells were plated in 6-well plates (35 mm
well diameter) to reach 90-95% confluence on the day of transfection. Forty
nanograms of plasmid DNA (previously assessed as a non-saturating DNA amount)
were co-transfected with 100 ng of pRSV-Luc plasmid, which contains the luciferase
reporter gene under the control of the Rous sarcoma virus promoter, using the
LIPOFECTAMINE 2000 reagent. Cells were kept in a CO2 incubator for 48 hours at 37 °C.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts
were normalized for luciferase activity and run in serial dilution on a 10%
SDS-acrylamide gel. Proteins were transferred to nitrocellulose and assayed with
antibodies directed against NS3, NS5A and NS5B to assess strength of expression and
correct proteolytic cleavage. Mock-transfected cells were used as a negative control.
Results from representative experiments testing pV1Jns-NS, pV1Jns-NSmut and
pV1Jns-NSOPTmut are shown in Figure 12.
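The normalization for luciferase activity described above can be sketched as follows. This is only an illustration, assuming that each extract has a measured luciferase reading per unit volume and that loading volumes are scaled so every lane carries the same number of luciferase units. The function name, sample readings and target value are hypothetical and are not taken from the experiments.

```python
def normalized_volumes(luc_per_ul, target_units):
    """Return the extract volume (in µl) that contains `target_units` of luciferase.

    luc_per_ul maps a sample name to its luciferase activity per µl (arbitrary units).
    """
    volumes = {}
    for sample, activity in luc_per_ul.items():
        if activity <= 0:
            raise ValueError(f"no luciferase activity measured for {sample}")
        volumes[sample] = target_units / activity
    return volumes


# Hypothetical readings (relative light units per µl of extract).
readings = {"pV1Jns-NS": 2000.0, "pV1Jns-NSmut": 4000.0, "pV1Jns-NSOPTmut": 1000.0}
print(normalized_volumes(readings, target_units=20000.0))
```

Serial dilutions would then be prepared from these equalized loads, so that differences in band intensity reflect expression strength rather than transfection efficiency.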

Example 3: Mice Immunization with Plasmid DNA Vectors 

The DNA plasmids pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut
were injected into different mouse strains to evaluate their potential to elicit
anti-HCV immune responses. Two different strains (Balb/C and C57Black6, N=9-10)
were injected intramuscularly with 25 or 50 µg of DNA followed by electrical pulses.
Each animal received two doses at a three-week interval.

The humoral immune response elicited in C57Black6 mice against the NS3
protein was measured in post dose two sera by ELISA on bacterially expressed NS3
protease domain. Antibodies specific for the tested antigen were detected in animals
immunized with all three vectors, with geometric mean titers (GMT) ranging from
94000 to 133000 (Tables 1-3).



Table 1: pV1Jns-NS

Mice n.    Titer
1          105466
2          891980
3          78799
4          39496
5          543542
6          182139
7          32351
8          95028
9          67800
GMT        94553



Table 2: pV1Jns-NSmut

Mice n.    Titer
11         202981
12         55670
13         130786
14         49748
15         17672
16         174958
17         44304
18         37337
19         78182
20         193695
GMT        75083



Table 3: pV1Jns-NSOPTmut

Mice n.    Titer
21         310349
22         43645
23         63496
24         82174
25         630778
26         297259
27         66861
28         146735
29         173506
30         77732
GMT        133165
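The GMT row of each table can be recomputed from the individual titers. The sketch below assumes the standard definition of a geometric mean (the exponential of the mean of the log-transformed titers); the function name is illustrative. It is applied to the ten individual titers listed in Table 3.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean: exponential of the arithmetic mean of the natural logs."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive values")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


# Individual titers for mice 21-30 as listed in Table 3.
table3 = [310349, 43645, 63496, 82174, 630778,
          297259, 66861, 146735, 173506, 77732]
print(round(geometric_mean_titer(table3)))
```

A geometric rather than arithmetic mean is the usual choice for titers because they span orders of magnitude, so a single high responder would otherwise dominate the group value.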



A T cell response was measured in C57Black6 mice immunized with
two intramuscular injections of 25 µg of plasmid DNA at a three-week interval.
A quantitative ELIspot assay was performed to determine the number of IFNγ-secreting
T cells in response to five pools of 20mer peptides overlapping by ten residues
and encompassing the NS3-NS5B sequence. The specific CD8+ response was analyzed by
the same assay using a 20mer peptide encompassing a CD8+ epitope for C57Black6 mice
(pep 1480).

Cells secreting IFNγ in an antigen-specific manner were detected using
a standard ELIspot assay. The T cell response in C57Black6 mice immunized with two
intramuscular injections of 50 µg of plasmid DNA at a three-week interval was
analyzed by the same ELIspot assay, measuring the number of IFNγ-secreting T cells
in response to five pools of 20mer peptides overlapping by ten residues and
encompassing the NS3-NS5B sequence.

Spleen cells were prepared from immunized mice and re-suspended in
R10 medium (RPMI 1640 supplemented with 10% FCS, 2 mM L-Glutamine, 50
U/ml-50 µg/ml Penicillin/Streptomycin, 10 mM Hepes, 50 µM 2-mercapto-ethanol).
Multiscreen 96-well Filtration Plates (Millipore, Cat. No. MAIPS4510, Millipore
Corporation, 80 Ashby Road Bedford, MA) were coated with purified rat anti-mouse
IFNγ antibody (PharMingen, Cat. No. 18181D, PharMingen, 10975 Torreyana
Road, San Diego, California 92121-1111 USA). After overnight incubation, plates
were washed with PBS 1X/0.005% Tween and blocked with 250 µl/well of R10
medium.

Splenocytes from immunized mice were prepared and incubated for
twenty-four hours in the presence or absence of 10 µM peptide at a density of
2.5 x 10^5/well or 5 x 10^5/well. After extensive washing (PBS 1X/0.005% Tween),
biotinylated rat anti-mouse IFNγ antibody (PharMingen, Cat. No. 18112D,
PharMingen, 10975 Torreyana Road, San Diego, California 92121-1111 USA) was
added and incubated overnight at 4°C. For development, streptavidin-AKP
(PharMingen, Cat. No. 13043E, PharMingen, 10975 Torreyana Road, San Diego,
California 92121-1111 USA) and 1-Step™ NBT-BCIP development solution (Pierce,
Cat. No. 34042, Pierce, P.O. Box 117, Rockford, IL 61105 USA) were added.

Pools of 20mer overlapping peptides encompassing the entire sequence
of the HCV BK strain NS3 to NS5B were used to reveal HCV-specific IFNγ-secreting
T cells. Similarly, a single 20mer peptide encompassing a CD8+ epitope for
C57Black6 mice was used to detect the CD8 response. Representative data from groups
of C57Black6 and Balb/C mice (N=9-10) immunized with two injections of 25 or 50
µg of plasmid vectors pV1Jns-NS, pV1Jns-NSmut and pV1Jns-NSOPTmut are shown
in Figures 13A and 13B.

Example 4: Immunization of Rhesus Macaques

Rhesus macaques (N=3) were immunized by intramuscular injection
with 5 mg of plasmid pV1Jns-NSOPTmut in 7.5 mg/ml CRL1005, 0.6 mM benzalkonium
chloride. Each animal received two doses in the deltoid muscle, at 0 and 4
weeks.

CMI was measured at different time points by IFN-γ ELISPOT. This
assay measures HCV antigen-specific CD8+ and CD4+ T lymphocyte responses, and
can be used for a variety of mammals, such as humans, rhesus monkeys, mice, and
rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5A, NS5B) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins while
smaller pools and individual peptides may be used to define the epitope specificity of
a response.
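The tiling of a protein into overlapping 20-mers offset by 10 amino acids can be illustrated with a short sketch. The function name, the handling of the C-terminal remainder (a final peptide flush with the end of the protein) and the demonstration sequence are assumptions made for illustration; the text does not state how the last peptide of each protein was chosen.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Tile a protein into peptides of `length` residues starting every `offset` residues."""
    if len(protein) <= length:
        return [protein]
    starts = list(range(0, len(protein) - length + 1, offset))
    # Assumption: add a final peptide flush with the C-terminus if the tiling stops short.
    if starts[-1] + length < len(protein):
        starts.append(len(protein) - length)
    return [protein[s:s + length] for s in starts]


# Hypothetical 45-residue sequence used only to show the tiling.
demo = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEF"
for peptide in overlapping_peptides(demo):
    print(peptide)
```

With a 10-residue offset, consecutive peptides share 10 residues, so any epitope of 11 residues or fewer is contained intact in at least one peptide.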

IFNγ ELISPOT

The IFNγ-ELISPOT assay provides a quantitative determination of
HCV-specific T lymphocyte responses. PBMC are serially diluted and placed in
microplate wells coated with anti-rhesus IFN-γ antibody (MD-1 U-Cytech). They are
cultured with an HCV peptide pool for 20 hours, resulting in the restimulation of the
precursor cells and secretion of IFN-γ. The cells are washed away, leaving the
secreted IFN bound to the antibody-coated wells in concentrated areas where the cells
were sitting. The captured IFN is detected with biotinylated anti-rhesus IFN antibody
(detector Ab U-Cytech) followed by alkaline phosphatase-conjugated streptavidin
(Pharmingen 13043E). The addition of insoluble alkaline phosphatase substrate
results in dark spots in the wells at the sites where the cells were located, leaving one
spot for each T cell that secreted IFN-γ.

The number of spots per well is directly related to the precursor
frequency of antigen-specific T cells. Gamma interferon was selected as the cytokine
visualized in this assay (using species-specific anti-gamma interferon monoclonal
antibodies) because it is the most common, and one of the most abundant, cytokines
synthesized and secreted by activated T lymphocytes. For this assay, the number of
spot forming cells (SFC) per million PBMCs is determined for samples in the
presence and absence (media control) of peptide antigens. Data from Rhesus
macaques on PBMC from post dose two material are shown in Table 4.



Table 4 



PVU-NSOPTmut 



Pep pools 



F (NS3p) 



21G 



99C161 



10 



99C166 



170 



G (NS3h) 



592 



229 



H(NS4) 



14 



16 



I (NS5a) 



71 



36 



L (NS5b) 



14 



23 



11 



M (NS5b) 



35 



8 



DMSO 



IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 5
mg DNA/dose in OPTIVAX/BAK of plasmid pV1Jns-NSOPTmut. Data are
expressed as SFC/10^6 PBMC.
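Converting a raw spot count into SFC per 10^6 PBMC is a simple scaling by the number of cells plated per well. The sketch below is illustrative only: the function names, the subtraction of the media control and the flooring at zero are assumptions, and the counts are hypothetical rather than values from Table 4.

```python
def sfc_per_million(spots, cells_per_well):
    """Scale a spot count from one well to spot forming cells per 10^6 cells."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return spots * 1_000_000 / cells_per_well


def background_subtracted(peptide_spots, control_spots, cells_per_well):
    """SFC/10^6 with peptide minus SFC/10^6 in the media control, floored at zero."""
    net = (sfc_per_million(peptide_spots, cells_per_well)
           - sfc_per_million(control_spots, cells_per_well))
    return max(net, 0.0)


# Hypothetical counts: 60 spots with peptide, 4 in the control, 2 x 10^5 cells per well.
print(background_subtracted(60, 4, 200_000))
```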






Example 5: Construction of Ad6 Pre-Adenovirus Plasmids 

Ad6 pre-adenovirus plasmids were obtained as follows: 

Construction of pAd6 E1-E3+ Pre-adenovirus Plasmid

An Ad6-based pre-adenovirus plasmid that can be used to generate
first generation Ad6 vectors was constructed either by taking advantage of the
extensive sequence identity (approx. 98%) between Ad5 and Ad6 or by using only Ad6
regions. Homologous recombination was used to clone wt Ad6 sequences into a
bacterial plasmid.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid
containing Ad5 and Ad6 regions is illustrated in Figure 10. Cotransformation of
BJ5183 bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed
the Ad5 ITR cassette resulted in the circularization of the viral genome by
homologous recombination. The ITR cassette contains sequences from the right (bp
33798 to 35935) and left (bp 1 to 341 and bp 3525 to 5767) ends of the Ad5 genome
separated by plasmid sequences containing a bacterial origin of replication and an
ampicillin resistance gene. The ITR cassette contains a deletion of E1 sequences from
Ad5 342 to 3524. The Ad5 sequences in the ITR cassette provide regions of
homology with the purified Ad6 viral DNA in which recombination can occur.

Potential clones were screened by restriction analysis and one clone
was selected as pAd6E1-E3+. This clone was then sequenced in its entirety.
pAd6E1-E3+ contains Ad5 sequences from bp 1 to 341 and from bp 3525 to 5548, Ad6 bp
5542 to 33784, and Ad5 bp 33967 to 35935 (bp numbers refer to the wt sequence for
both Ad5 and Ad6). pAd6E1-E3+ contains the coding sequences for all Ad6 virion
structural proteins, which constitute its serotype specificity.

A general strategy used to recover pAd6E1-E3+ as a bacterial plasmid
containing Ad6 regions is illustrated in Figure 11. Cotransformation of BJ5183
bacteria with purified wt Ad6 viral DNA and a second DNA fragment termed the Ad6
ITR cassette resulted in the circularization of the viral genome by homologous
recombination. The ITR cassette contains sequences from the right (bp 35460 to
35759) and left (bp 1 to 450 and bp 3508 to 3807) ends of the Ad6 genome separated
by plasmid sequences containing a bacterial origin of replication and an ampicillin
resistance gene. These three segments were generated by PCR and cloned
sequentially into pNEB193, generating pNEBAd6-3 (the ITR cassette). The ITR
cassette contains a deletion of E1 sequences from Ad5 451 to 3507. The Ad6
sequences in the ITR cassette provide regions of homology with the purified Ad6 viral
DNA in which recombination can occur.

Construction of pAd6 E1-E3- Pre-adenovirus Plasmids

Ad6-based vectors containing Ad5 regions and deleted in the E3 region
were constructed starting with pAd6E1-E3+ containing Ad5 regions. A 5322 bp
subfragment of pAd6E1-E3+ containing the E3 region (Ad6 bp 25871 to 31192) was
subcloned into pABS.3, generating pABSAd6E3. Three E3 deletions were then made
in this plasmid, generating three new plasmids: pABSAd6E3(1.8Kb) (deleted for Ad6
bp 28602 to 30440), pABSAd6E3(2.3Kb) (deleted for Ad6 bp 28157 to 30437) and
pABSAd6E3(2.6Kb) (deleted for Ad6 bp 28157 to 30788). Bacterial recombination
was then used to substitute the three E3 deletions back into pAd6E1-E3+, generating
the Ad6 genome plasmids pAd6E1-E3-1.8Kb, pAd6E1-E3-2.3Kb and pAd6E1-E3-2.6Kb.
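The sizes implied by the coordinates above can be checked with simple arithmetic. The sketch assumes that both endpoints of each quoted range are inclusive, which is consistent with the stated 5322 bp length of the E3 subfragment; the function name is illustrative.

```python
def span_length(first_bp, last_bp):
    """Number of base pairs in a coordinate range with both endpoints included."""
    return last_bp - first_bp + 1


# Ad6 coordinates quoted in the text for the E3 subfragment and the three deletions.
regions = {
    "E3 subfragment": (25871, 31192),
    "pABSAd6E3(1.8Kb)": (28602, 30440),
    "pABSAd6E3(2.3Kb)": (28157, 30437),
    "pABSAd6E3(2.6Kb)": (28157, 30788),
}
for name, (first, last) in regions.items():
    print(f"{name}: {span_length(first, last)} bp")
```

Under that assumption the three deletions come to 1839, 2281 and 2632 bp, in line with the 1.8Kb, 2.3Kb and 2.6Kb plasmid names.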
Example 6: Generation of Ad5 Genome Plasmid with the NS Sequence

A pcDNA3 plasmid (Invitrogen) containing the coding region
NS3-NS4A-NS4B-NS5A was digested with XmnI and NruI restriction enzymes and the DNA
fragment containing the CMV promoter, the NS3-NS4A-NS4B-NS5A coding sequence and the
Bovine Growth Hormone (BGH) polyadenylation signal was cloned into the unique EcoRV
restriction site of the shuttle vector pDelE1Spa, generating the Sva3-5A vector.

A pcDNA3 plasmid containing the coding region
NS3-NS4A-NS4B-NS5A-NS5B was digested with XmnI and EcoRI (partial digestion), and the
DNA fragment containing part of NS5A, the NS5B gene and the BGH polyadenylation signal
was cloned into the Sva3-5A vector, digested with EcoRI and BglII and blunted with
Klenow, generating the Sva3-5B vector.

The Sva3-5B vector was finally digested with SspI and Bst1107I restriction
enzymes and the DNA fragment containing the expression cassette (CMV promoter,
NS3-NS4A-NS4B-NS5A-NS5B coding sequence and the BGH polyadenylation signal)
flanked by adenovirus sequences was co-transformed with pAd5HVO (E1-,E3-) ClaI
linearized genome plasmid into the bacterial strain BJ5183, to generate pAd5HVONS.
pAd5HVO contains Ad5 bp 1 to 341, bp 3525 to 28133 and bp 30818 to 35935.

Example 7: Generation of Adenovirus Genome Plasmids with the NSmut Sequence

Adenovirus genome plasmids containing an NS-mut sequence were
generated in an Ad5 or Ad6 background. The Ad6 background contained Ad5 regions
at bases 1 to 450, 3511 to 5548 and 33967 to 35935.

pV1JNS3-5Akozak was digested with BglII and XbaI restriction
enzymes and the DNA fragment containing the Kozak sequence and the sequence
coding for NS3-NS4A-NS4B-NS5A was cloned into a BglII and XbaI digested
polypMRKpdelE1 shuttle vector. The resulting vector was designated shNS3-5Akozak.

polypMRKpdelE1 is a derivative of MRKpdelE1(Pac/pIX/pack450) +
CMVmin + BGHpA(str.) modified by the insertion of a polylinker containing
recognition sites for BglII, PmeI, SwaI, XbaI and SalI into the unique BglII
restriction site present downstream of the CMV promoter. MRKpdelE1(Pac/pIX/pack450) +
CMVmin + BGHpA(str.) contains Ad5 sequences from bp 1 to 5792 with a deletion
of E1 sequences from bp 451 to 3510. The human CMV promoter and BGH
polyadenylation signal were inserted into the E1 deletion in an E1 parallel orientation
with a unique BglII site separating them.

The NS5B fragment, mutated to abrogate enzymatic activity and with a
strong translation termination at the 3' end, was obtained by assembly PCR and
inserted into the shNS3-5Akozak vector via homologous recombination, generating
polypMRKpdelE1NSmut. In polypMRKpdelE1NSmut the NS-mut coding sequence
is under the control of the CMV promoter and the BGH polyadenylation signal is present
downstream.

The gene expression cassette and the flanking regions, which contain
adenovirus sequences allowing homologous recombination, were excised by digestion
with PacI and Bst1107I restriction enzymes and co-transformed with either
pAd5HVO (E1-,E3-) or pAd6E1-E3-2.6Kb ClaI linearized genome plasmids into the
bacterial strain BJ5183, to generate pAd5HVONSmut and pAd6E1-,E3-NSmut,
respectively.

pAd6E1-E3-2.6Kb contains Ad5 bp 1 to 341 and bp 3525 to
5548, Ad6 bp 5542 to 28157 and bp 30788 to 33784, and Ad5 bp 33967 to
35935 (bp numbers refer to the wt sequence for both Ad5 and Ad6). In both plasmids
the viral ITRs are joined by plasmid sequences that contain the bacterial origin of
replication and an ampicillin resistance gene.

Example 8: Generation of Adenovirus Genome Plasmids with the NSOPTmut 

The human codon-optimized synthetic gene (NSOPTmut) provided by
SEQ. ID. NO. 3, cloned into a pCRBlunt vector (Invitrogen), was digested with BamHI
and SalI restriction enzymes and cloned into the BglII and SalI restriction sites
present in the shuttle vector polypMRKpdelE1. The resulting clone
(polypMRKpdelE1NSOPTmut) was digested with PacI and Bst1107I restriction
enzymes and co-transformed with either pAd5HVO (E1-,E3-) or pAd6E1-E3-2.6Kb
ClaI linearized genome plasmids into the bacterial strain BJ5183, to generate
pAd5HVONSOPTmut and pAd6E1-,E3-NSOPTmut, respectively.

Example 9: Rescue and Amplification of Adenovirus Vectors 

Adenovectors were rescued in Per.C6 cells. Per.C6 cells were grown in 10%
FCS/DMEM supplemented with L-glutamine (final 4 mM), penicillin/streptomycin
(final 100 IU/ml) and 10 mM MgCl2. After infection, cells were kept in the same
medium supplemented with 5% horse serum (HS). For viral rescue, 2.5 x 10^6 Per.C6
cells were plated in 6 cm Ø Petri dishes.

Twenty-four hours after plating, cells were transfected by the calcium
phosphate method with 10 µg of the Pac I linearized adenoviral DNA. The DNA
precipitate was left on the cells for 4 hours. The medium was removed and 5%
HS/DMEM was added.

Cells were kept in a CO2 incubator until a cytopathic effect was visible
(1 week). Cells and supernatant were recovered and subjected to 3X freeze/thaw
cycles (liquid nitrogen / water bath at 37°C). The lysate was centrifuged at 3000 rpm
at -4°C for 20 minutes and the recovered supernatant (corresponding to a cell lysate
containing virus passed on cells only once; P1) was used, in the amount of 1 ml/dish,
to infect 80-90% confluent Per.C6 cells in 10 cm Ø Petri dishes. The infected cells
were incubated until a cytopathic effect was visible, cells and supernatant were
recovered and the lysate was prepared as described above (P2).

P2 lysate (4 ml) was used to infect 2 x 15 cm Ø Petri dishes. The
lysate recovered from this infection (P3) was kept in aliquots at -80°C as a stock of
virus to be used as the starting point for large viral preparations. In this case, 1 ml
of the stock was enough to infect 2 x 15 cm Ø Petri dishes and the resulting lysate (P4)
was used for the infection of the Petri dishes devoted to the large scale infection.

Further amplification was obtained from the P4 lysate, which was
diluted in medium without FCS and used to infect 30 x 15 cm Ø Petri dishes (with
80%-90% confluent Per.C6 cells) in the amount of 10 ml/dish. Cells were incubated for
1 hour in the CO2 incubator, with gentle mixing every 20 minutes. 12 ml/dish of 5%
HS/DMEM was added and cells were incubated until a cytopathic effect was visible
(about 48 hours).

Cells and supernatant were collected and centrifuged at 2K rpm for 20
minutes at 4°C. The pellet was resuspended in 15 ml of 0.1 M Tris pH=8.0. Cells
were lysed by 3X freeze/thaw cycles (liquid nitrogen / water bath at 37°C). 150 µl
of 2 M MgCl2 and 75 µl of DNAse (10 mg of bovine pancreatic deoxyribonuclease I
in 10 ml of 20 mM Tris-HCl pH=7.4, 50 mM NaCl, 1 mM dithiothreitol, 0.1 mg/ml
bovine serum albumin, 50% glycerol) were added. After a 1 hour incubation at 37°C
in a water bath (vortexing every 15 minutes) the lysate was centrifuged at 4K rpm for
15 minutes at 4°C. The recovered supernatant was ready to be applied to a CsCl gradient.

The CsCl gradients were prepared in SW40 ultra-clear tubes as
follows:

0.5 ml of 1.5d CsCl
3 ml of 1.35d CsCl
3 ml of 1.25d CsCl

5 ml/tube of viral supernatant was applied.

If necessary, the tubes were topped up with 0.1 M Tris-Cl pH=8.0.
Tubes were centrifuged at 35K rpm for 1 hour at -10°C with rotor SW40. The viral
bands (located at the 1.25/1.35 interface) were collected using a syringe.

The virus was transferred into a new SW40 ultra-clear tube and 1.35d
CsCl was added to top the tube up. After centrifugation at 35K rpm for 24 hours at
10°C in the rotor SW40, the virus was collected in the smallest possible volume and
dialyzed extensively against buffer A105 (5 mM Tris, 5% sucrose, 75 mM NaCl, 1
mM MgCl2, 0.005% polysorbate 80 pH=8.0). After dialysis, glycerol was added to a
final concentration of 10% and the virus was stored in aliquots at -80°C.

Example 10: Enhanced Adenovector Rescue 

First generation Ad5 and Ad6 vectors carrying the HCV NSOPTmut
transgene were found to be difficult to rescue. A possible block in the rescue process
might be attributed to inefficient replication of plasmid DNA, which is a sub-optimal
template for the replication machinery of adenovirus. The absence of the terminal
protein linked to the 5' ends of the DNA (normally present in the viral DNA),
together with the very high G-C content of the transgene inserted in the E1 region of
the vector, may cause a substantial reduction in the replication rate of the
plasmid-derived adenovirus.

To set up a more efficient and reproducible procedure for rescuing Ad
vectors, an expression vector (pE2; Figure 19) containing all E2 proteins (polymerase,
pre-terminal protein and DNA binding protein) as well as E4 orf6 under the control of
a tet-inducible promoter was employed. The transfection of pE2 in combination with a
normal preadeno plasmid in PerC6 and in 293 cells leads to a strong increase in Ad DNA
replication and to a more efficient production of complete infectious adenovirus
particles.

Plasmid Construction

pE2 is based on the cloning vector pBI (CLONTECH) with the
addition of two elements to allow episomal replication and selection in cell culture:
(1) the EBV-OriP (EBV [nt] 7421-8042) region permitting plasmid replication in
synchrony with the cell cycle when EBNA-1 is expressed and (2) the hygromycin-B
phosphotransferase (HPH)-resistance gene allowing a positive selection of
transformed cells. The two transcriptional units for the adenoviral genes E2 a and b
and E4-Orf6 were constructed and assembled in pE2 as described below.

The Ad5-Polymerase ClaI/SphI fragment and the Ad5-pTP
Acc65/EcoRV fragment were obtained from pVac-Pol and pVac-pTP (Stunnenberg et
al., NAR 16:2431-2444, 1988). Both fragments were filled in with Klenow and cloned
into the SalI (filled) and EcoRV sites of pBI, respectively, obtaining pBI-Pol/pTP.

The EBV-OriP element from pCEP4 (Invitrogen) was first inserted within
two chicken β-globin insulator dimers by cloning it into the BamHI site of pJC13-1
(Chung et al., Cell 74(3):505-14, 1993). The HS4-OriP fragment from pJC13-OriP was
then cloned inside pSA-1mv (a plasmid containing the tk-Hygro-B resistance gene
expression cassette as well as the Ad5 replication origin), the ITRs arranged as a
head-to-tail junction, obtained by PCR from pFG140 (Graham, EMBO J. 3:2917-2922, 1984)
using the following primers: 5'-TCGAATCGATACGCGAACCTACGC-3' (SEQ.
ID. NO. 16) and 5'-TCGACGTGTCGACTTCGAAGCGCACACCAAAAACGTC-3'
(SEQ. ID. NO. 17), thus generating pMVHS4Orip. A DNA fragment from
pMVHS4Orip, containing the insulated OriP, Ad5 ITR junction and tk-HygroB
cassette, was then inserted into the pBI-Pol/pTP vector restricted with AseI/AatII,
generating pBI-Pol/pTPHS4.

To construct the second transcriptional unit expressing Ad5-Orf6 as
well as Ad5-DBP, E4orf6 (Ad 5 [nt] 33193-34077) obtained by PCR was first
inserted into the pBI vector, generating pBI-Orf6. Subsequently, the DBP coding DNA
sequence (Ad 5 [nt] 22443-24032) was inserted into pBI-Orf6, obtaining the second
bi-directional Tet-regulated expression vector (pBI-DBP/E4orf6). The original polyA
signals present in pBI were substituted with BGH and SV40 polyA.

pBI-DBP/E4orf6 was then modified by inserting a DNA fragment
containing the Adeno5-ITRs arranged in head-to-tail junction plus the hygromycin B
resistance gene obtained from plasmid pSA-1mv. The new plasmid
pBI-DBP/E4orf6shuttle was then used as donor plasmid to insert the second tet-regulated
transcriptional unit into pBI-Pol/pTPHS4 by homologous recombination using E. coli
strain BJ5183, obtaining pE2.

Cell lines, Transfections and Virus Amplification 

PerC6 cells were cultured in Dulbecco's modified Eagle's Medium
(DMEM) plus 10% fetal bovine serum (FBS), 10 mM MgCl2, penicillin (100 U/ml),
streptomycin (100 µg/ml) and 2 mM glutamine.

All transient transfections were performed using Lipofectamine2000
(Invitrogen) as described by the manufacturer. 90% confluent PER.C6™ cells plated in
6-cm plates were transfected with 3.5 µg of Ad5/6NSOPTmut pre-adeno plasmids,
digested with PacI, alone or in combination with 5 µg pE2 plus 1 µg pUHD52.1.
pUHD52.1 is the expression vector for the reverse tet transactivator 2 (rtTA2)
(Urlinger et al., Proc. Natl. Acad. Sci. USA 97(14):7963-7968, 2000). Upon
transfection, cells were cultivated in the presence of 1 ng/ml of doxycycline to
activate pE2 expression. Seven days post-transfection, cells were harvested and cell
lysate was obtained by three cycles of freeze-thaw. Two ml of cell lysate were used to
infect a second 6-cm dish of PerC6. Infected cells were cultivated until a full CPE was
observed, then harvested. The virus was serially passaged five times as described
above, then purified on a CsCl gradient. The DNA structure of the purified virus was
checked by endonuclease digestion and agarose gel electrophoresis analysis and
compared to the original pre-adeno plasmid restriction pattern.

Example 11: Partial Optimization of HCV Polyprotein Encoding Nucleic Acid

Partial optimization of HCV polyprotein encoding nucleic acid was
performed to facilitate the production of adenovectors containing codons optimized
for expression in a human host. The overall objective was to provide for increased
expression due to codon optimization, while facilitating the production of an
adenovector encoding HCV polyprotein.

Several difficulties were encountered in producing an adenovector
encoding HCV polyprotein with codons optimized for expression in a human host. An
adenovector containing an optimized sequence (SEQ. ID. NO. 3) was found to be
more difficult to synthesize and rescue than an adenovector containing a
non-optimized sequence (SEQ. ID. NO. 2).

The difficulties in producing an adenovector containing SEQ. ID. NO.
3 were attributed to a high GC content. A particularly problematic region was the
region at about position 3900 of NSOPTmut (SEQ. ID. NO. 3).

Alternative versions of optimized HCV encoding nucleic acid
sequence were designed to facilitate its use in an adenovector. The alternative
versions, compared to NSOPTmut, were designed to have a lower overall GC content and
to reduce or avoid the presence of potentially problematic motifs of consecutive G's or
C's, while maintaining a high level of codon optimization to allow improved
expression of the encoded polyprotein and the individual cleavage products.

A starting point for the generation of a suboptimally codon-optimized
sequence is the coding region of the NSOPTmut nucleotide sequence (bases 7 to 5961
of SEQ. ID. NO. 3). Values for codon usage frequencies (normalized to a total of 1.0
for each amino acid) were taken from the file human_high.cod available in the
Wisconsin Package Version 10.3 (Accelrys Inc., a wholly owned subsidiary of
Pharmacopeia, Inc).

To reduce the local and overall GC content, a table defining preferred
codon substitutions for each amino acid was manually generated. For each amino acid,
the codon having 1) a lower GC content as compared to the most frequent codon and
2) a relatively high observed codon usage frequency (as defined in human_high.cod)
was chosen as the replacement codon. For example, for Arg the codon with the
highest frequency is CGC. Out of the other five alternative codons encoding Arg
(CGG, AGG, AGA, CGT, CGA), three (AGG, CGT, CGA) reduce the GC content by
1 base, one (AGA) by two bases and one (CGG) by 0 bases. Since the AGA codon is
listed in human_high.cod as having a relatively low usage frequency (0.1), the codon
substituting CGC was chosen to be AGG, with a relative frequency of 0.18.
Similar criteria were applied in order to establish codon replacements for the other
amino acids, resulting in the list shown in Table 5. Parameters applied in the following
optimization procedure were determined empirically such that the resulting sequence
maintained a considerably improved codon usage (for each amino acid) and the GC
content (overall and in the form of local stretches of consecutive G's and/or C's) was
decreased.
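The GC arithmetic in the Arg example can be reproduced with a few lines. This sketch only counts G and C bases per codon; the helper name is illustrative, and the usage frequencies from human_high.cod are not reproduced here.

```python
def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon.upper())


most_frequent = "CGC"  # most frequent Arg codon, per the text
alternatives = ["CGG", "AGG", "AGA", "CGT", "CGA"]

for codon in alternatives:
    reduction = gc_count(most_frequent) - gc_count(codon)
    print(codon, "reduces GC content by", reduction, "base(s)")
```

The choice between candidates with the same GC reduction then rests on the usage frequency, which is why AGG (0.18) is preferred over AGA (0.1) despite AGA removing one more G/C base.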

Two examples of partially optimized HCV encoding sequences are
provided by SEQ. ID. NO. 10 and SEQ. ID. NO. 11. SEQ. ID. NO. 10 provides an
HCV encoding sequence that is partially optimized throughout. SEQ. ID. NO. 11
provides an HCV encoding sequence fully optimized for codon usage with the
exception of a region that was partially optimized.

Codon optimization was performed using the following procedure:
Step 1) The coding region of the input fully optimized NSOPTmut
sequence was analyzed using a sliding window of 3 codons (9 bases), shifting the
window by one codon after each cycle. Whenever a stretch containing 5 or more
consecutive C's and/or G's was detected in the window, the following replacement rule
was applied: Let N indicate the number of codon replacements previously performed.
If N is odd, replace the middle codon in the window with the codon specified in Table
5; if N is even, replace the third (terminal) codon in the window with the codon
specified in a codon optimization table such as human_high.cod. If Leu or Val is
present at the second or third codon, do not apply any replacement, in order not to
introduce Leu or Val codons with very low relative codon usage frequency (see, for
example, human_high.cod). In the following cycle, analysis of the shifted window was
then applied to a sequence containing the replacements of the previous cycle.
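The detection part of Step 1 can be sketched as below. This is a minimal illustration and not the procedure actually used: it only finds 3-codon windows that contain a run of 5 or more consecutive G/C bases and reports which codon position the alternating rule would target. The function names and the demonstration sequence are hypothetical, and the replacement itself (which requires Table 5 and the Leu/Val exception) is deliberately left out.

```python
import re


def split_codons(cds):
    """Split a coding sequence into complete codons."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def windows_with_gc_run(cds, window_codons=3, min_run=5):
    """Return start indices (in codons) of windows holding a run of >= min_run G/C bases."""
    codons = split_codons(cds.upper())
    pattern = re.compile(r"[GC]{%d,}" % min_run)
    hits = []
    for i in range(len(codons) - window_codons + 1):
        window = "".join(codons[i:i + window_codons])
        if pattern.search(window):
            hits.append(i)
    return hits


def position_to_replace(n_previous):
    """Codon index within the window: middle (1) if N is odd, third (2) if N is even."""
    return 1 if n_previous % 2 == 1 else 2


# Synthetic example, not taken from SEQ. ID. NO. 3.
demo = "ATGCGCGGCAAATTT"
print(windows_with_gc_run(demo))
```

In the full procedure each replacement changes the sequence seen by the next window, so detection and replacement would be interleaved in a single pass rather than run as separate scans.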

The alternating replacement of the middle and terminal codon in the 3
codon window was found empirically to give a more satisfactory overall maintenance of
optimized codon usage while also reducing GC content (as judged from the final
sequence after the procedure). In general, however, the precise replacement strategy
depends on the amino acid sequence encoded by the nucleotide sequence under
analysis and will have to be determined empirically.

Step 2) The sequence containing all the codon replacements performed
during step 1) was then subjected to an additional analysis using a sliding window
21 codons (63 bases) in length, the window length being an adjustable parameter, and
the overall GC content in the window was determined. If the GC content in the window
was higher than 70%, the following codon replacement strategy was applied: In the
window, replace the codons for the amino acids Asn, Asp, Cys, Glu, His, Ile, Lys, Phe
and Tyr by the codons given in Table 5. Restriction of the replacement to this set of
amino acids was motivated by the fact that a) the replacement codon still has an
acceptably high frequency of usage in human_high.cod and b) the average overall human
codon usage in CUTG for the replacement codon is nearly as high as that of the most
frequent codon. In the following cycle, analysis of the shifted window is then applied
to a sequence containing the replacements of the previous cycle.
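The window scan of Step 2 can be sketched as follows. The code only flags 21-codon windows whose GC content exceeds 70%; it does not perform the Table 5 replacements. Shifting the window by one codon per cycle is an assumption carried over from Step 1, and the function names and synthetic input are illustrative.

```python
def gc_fraction(seq):
    """Fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def high_gc_windows(cds, window_codons=21, threshold=0.70):
    """Return codon indices at which a window exceeding the GC threshold starts."""
    size = window_codons * 3
    flagged = []
    for start in range(0, len(cds) - size + 1, 3):  # assumed shift: one codon per cycle
        if gc_fraction(cds[start:start + size]) > threshold:
            flagged.append(start // 3)
    return flagged


# Synthetic input: 21 GC-rich codons followed by 21 AT-rich codons.
demo = "GCC" * 21 + "AAT" * 21
print(high_gc_windows(demo))
```

Both `window_codons` and `threshold` are exposed as parameters because the text notes that these values were set empirically and depend on the sequence being analyzed.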

The threshold 70% was determined empirically by compromising
between an overall reduction in GC content and maintenance of a high codon
optimization for the individual amino acids. As in step 1) the precise replacement
strategy (choice of amino acids and GC content threshold value) will again depend on
the amino acid sequence encoded by the nucleotide sequence under analysis and will
have to be determined empirically.

Step 3) The sequence generated by steps 1) and 2) was then manually
edited and additional codons were changed according to the following criteria:
Regions still having a GC content higher than 70% over a window of 21 codons were
examined manually and a few codons were replaced again following the scheme given
in Table 5.

Subsequent steps were performed to provide for useful restriction sites,
to remove possible open reading frames on the complementary strand, to add
homologous recombination regions, to add a Kozak signal, and to add a terminator.
These steps are numbered 4-7.

Step 4) The sequence generated in step 3 was examined for the absence
of certain restriction sites (BglII, PmeI and XbaI) and presence of only 1 StuI site to
allow a subsequent cloning strategy using a subset of restriction enzymes. Two sites
(one for BglII and one for StuI) were removed from the sequence by replacing codons
that were part of the respective recognition sites.
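The site check of step 4) can be illustrated as follows. This is a minimal sketch, assuming the standard recognition sequences of the four enzymes (BglII AGATCT, PmeI GTTTAAAC, XbaI TCTAGA, StuI AGGCCT); the function names are hypothetical. All four sites are palindromic, so scanning the sense strand alone is sufficient.

```python
# Standard recognition sequences (all palindromic).
RECOGNITION_SITES = {
    "BglII": "AGATCT",
    "PmeI": "GTTTAAAC",
    "XbaI": "TCTAGA",
    "StuI": "AGGCCT",
}


def count_sites(sequence, site):
    """Count occurrences of `site` in `sequence`, including overlapping ones."""
    sequence = sequence.upper()
    count = 0
    position = sequence.find(site)
    while position != -1:
        count += 1
        position = sequence.find(site, position + 1)
    return count


def check_cloning_constraints(sequence):
    """Return (ok, counts): ok is True when BglII, PmeI and XbaI are absent
    and exactly one StuI site is present, as required in step 4."""
    counts = {
        enzyme: count_sites(sequence, site)
        for enzyme, site in RECOGNITION_SITES.items()
    }
    ok = (
        all(counts[enzyme] == 0 for enzyme in ("BglII", "PmeI", "XbaI"))
        and counts["StuI"] == 1
    )
    return ok, counts
```

A sequence failing the check would then be edited by exchanging a synonymous codon that overlaps the offending site, as described in the text.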

Step 5) The sequence generated by steps 1) through 4) was then
modified to allow subsequent generation of a modified NSOPTmut
sequence (by homologous recombination). In the sequence obtained from steps 1)
through 4) the segment comprising bases 3556 to 3755 and the segment comprising
bases 4456 to 4656 were replaced by the corresponding segments from NSOPTmut.
The segment comprising bases 3556 to 4656 of SEQ. ID. NO. 10 can be used to
replace the problematic region in NSOPTmut (around position 3900) by homologous
recombination, thus creating the variant of NSOPTmut having the sequence of SEQ.
ID. NO. 11.

Step 6) Analysis of the sequence generated through steps 1) to 5)
revealed a potential open reading frame spanning nearly the complete fragment on the
complementary strand. Removal of all codons CTA and TTA (Leu) and TCA (Ser)
from the sense strand effectively removed all stop codons in one of the reading frames
on the complementary strand. Although the likelihood of transcription of this
complementary strand open reading frame and subsequent translation into protein is
very small, in order to exclude a potential interference with the transcription and
subsequent translation of the sequence encoded on the sense strand, TCA codons for
Ser were introduced on the sense strand approximately every 500 bases. No changes were
introduced in the segments introduced during step 5) to allow homologous
recombination. The TCA codon for Ser was preferred over the CTA and TTA codons
for Leu because of the higher relative frequency for TCA (0.05) as compared to CTA
(0.02) and TTA (0.03) in human_high.cod. In addition, the average human codon
usage from CUTG favored TCA (0.14 against 0.07 for CTA and TTA).
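The relationship between the sense codons CTA, TTA and TCA and stop codons on the complementary strand can be verified with a short sketch. The helper names below are hypothetical; the logic is simply the reverse complement of each codon, read in the complementary-strand frame that is in register with the sense codons.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def antisense_stops(sense_codons):
    """Map each sense codon to its reverse complement when that reverse
    complement is a stop codon on the complementary strand."""
    stops = {}
    for codon in sense_codons:
        antisense = reverse_complement(codon)
        if antisense in STOP_CODONS:
            stops[codon] = antisense
    return stops


# CTA -> TAG, TTA -> TAA, TCA -> TGA; CTG (the preferred Leu codon) gives CAG.
print(antisense_stops(["CTA", "TTA", "TCA", "CTG"]))
```

This is why a sense strand free of CTA, TTA and TCA leaves one complementary-strand frame without stop codons, and why reintroducing TCA at intervals interrupts that frame.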

Step 7) In a final step GCCACC was added at the 5' end of the
sequence to generate an optimized internal ribosome entry site (Kozak signal) and a
TAAA stop signal was added at the 3' end. To maintain the initiation of translation
properties of NSsuboptmut the first 8 codons of the coding region were kept identical
to the NSOPTmut sequence. The resulting sequence was again checked for the
absence of BglII, PmeI and XbaI recognition sites and the presence of only 1 StuI site.

The NSsuboptmut sequence (SEQ. ID. NO. 10) has an overall reduced
GC content (63.5%) as compared to NSOPTmut (70.3%) and maintains a well
optimized level of codon usage. Nucleotide sequence identity of
NSsuboptmut is 77.2% with respect to NSmut.

Table 5: Definition of codon replacements performed during steps 1) and 2).

Amino   Most frequent   Relative    Reduction in GC   Replacement   Relative
Acid    codon           frequency   content (bases)   codon         frequency

Amino Acids where the replacement codon reduces the codon GC-content by 1 base

Ala     GCC             0.51        1                 GCT           0.17
Arg     CGC             0.37        1                 AGG           0.18
Asn     AAC             0.78        1                 AAT           0.22
Asp     GAC             0.75        1                 GAT           0.25
Cys     TGC             0.68        1                 TGT           0.32
Glu     GAG             0.75        1                 GAA           0.25
Gln     CAG             0.88        1                 CAA           0.12
Gly     GGC             0.50        1                 GGA           0.14
His     CAC             0.79        1                 CAT           0.21
Ile     ATC             0.77        1                 ATT           0.18
Lys     AAG             0.82        1                 AAA           0.18
Phe     TTC             0.80        1                 TTT           0.20
Pro     CCC             0.48        1                 CCT           0.19
Ser     AGC             0.34        1                 TCT           0.13
Thr     ACC             0.51        1                 ACA           0.14
Tyr     TAC             0.74        1                 TAT           0.26

Amino Acids with no alternative codon

Met     ATG             1.00        0                 ATG           1.00
Trp     TGG             1.00        0                 TGG           1.00

Amino Acids where the replacement codon has a very low relative frequency. These
amino acids were excluded from the replacement procedure

Leu     CTG             0.58        1                 TTG           0.06
Val     GTG             0.64        1                 GTT           0.07



Example 12: Virus Characterization 

Adenovectors were characterized by: (a) measuring the physical
particles/ml; (b) running a TaqMan PCR assay; and (c) checking protein expression
after infection of HeLa cells.

a) Physical Particles Determination

CsCl purified virus was diluted 1/10 and 1/100 in 0.1% SDS PBS. As
a control, buffer A105 was used. These dilutions were incubated 10 minutes at 55°C.
After spinning the tubes briefly, O.D. at 260 nm was measured. The amount of viral
particles was calculated as follows: 1 OD 260 nm = 1.1 x 10^12 physical particles/ml.
The results were typically between 5 x 10^11 and 1 x 10^12 physical particles/ml.
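The conversion from absorbance to particle concentration can be written out as a small calculation. This is a sketch only: the function name is hypothetical, and correcting for the dilution factor and subtracting the buffer control reading are assumptions about how the measured O.D. is scaled back to the undiluted stock. The readings in the example are invented for illustration.

```python
# Conversion factor stated in the text: 1 OD(260 nm) = 1.1 x 10^12 particles/ml.
PARTICLES_PER_OD260 = 1.1e12


def physical_particles_per_ml(od260, dilution_factor, blank_od260=0.0):
    """Estimate physical particles/ml of the undiluted virus stock.

    od260           -- absorbance of the diluted, SDS-treated sample
    dilution_factor -- 10 for a 1/10 dilution, 100 for a 1/100 dilution
    blank_od260     -- absorbance of the buffer control (assumed subtracted)
    """
    if dilution_factor <= 0:
        raise ValueError("dilution factor must be positive")
    return (od260 - blank_od260) * dilution_factor * PARTICLES_PER_OD260


# Hypothetical reading of 0.05 for the 1/10 dilution with a blank of 0.
print(f"{physical_particles_per_ml(0.05, 10):.2e} particles/ml")
```

With these illustrative numbers the estimate falls inside the range reported above for typical preparations.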



b) TaqMan PCR Assay 

TaqMan PCR assay was used for adenovector genome quantification
(Q-PCR particles/ml). The TaqMan PCR assay was performed using the ABI Prism 7700
sequence detector. The reaction was performed in a final 50 µl volume in the
presence of oligonucleotides (at final 200 nM) and probe (at final 200 µM) specific
for the adenoviral backbone. The virus was diluted 1/10 in 0.1% SDS PBS and
incubated 10 minutes at 55°C. After spinning the tube briefly, serial 1/10 dilutions (in
water) were prepared. 10 µl of the 10^-[illegible], 10^-5 and 10^-7 dilutions were used
as templates in the PCR assay.

The amount of particles present in each sample was calculated on the
basis of a standard curve run in the same experiment. Typically results were between
1 x 10^12 and 3 x 10^12 Q-PCR particles/ml.



c) Expression of HCV Non-Structural Proteins 

Expression of HCV NS proteins was tested by infection of HeLa cells.
Cells were plated the day before the infection at 1.5 x 10^6 cells/dish (10 cm Ø Petri
dishes). Different amounts of CsCl purified virus corresponding to m.o.i. of 50, 250
and 1250 pp/cell were diluted in medium (FCS free) up to a final volume of 5 ml. The
diluted virus was added to the cells and incubated for 1 hour at 37°C in a CO2
incubator (gently mixing every 20 minutes). 5 ml of 5% HS-DMEM was added and
the cells were incubated at 37°C for 48 hours.

Cell extracts were prepared in 1% Triton/TEN buffer. The extracts
were run on 10% SDS-acrylamide gel, blotted on nitrocellulose and assayed with
antibodies directed against NS3, NS5a and NS5b in order to check the correct
polyprotein cleavage. Mock-infected cells were used as a negative control. Results
from representative experiments testing the Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut
and MRKAd6-NSOPTmut are shown in Figure 14.



Example 13: Mice Immunization with Adenovectors Encoding Different NS 
Cassettes 

The adenovectors Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut and
MRKAd6-NSOPTmut were injected in C57Black6 mice to evaluate their
potential to elicit anti-HCV immune responses. Groups of animals (N=9-10) were
injected intramuscularly with 10^9 pp of CsCl purified virus. Each animal received two
doses at a three-week interval.

Humoral immune response against the NS3 protein was measured in
post dose two sera from C57Black6 immunized mice by ELISA on bacterially
expressed NS3 protease domain. Antibodies specific for the tested antigen were
detected with geometric mean titers (GMT) ranging from 100 to 46000 (Tables 6, 7, 8
and 9).



Table 6: Ad5-NS

Mice n.   1     2      3      4     5      6     7      8      9      10     GMT
Titer     50    253    50     50    50     2257  504    50     50     50     108

Table 7: Ad5-NSmut

Mice n.   11    12     13     14    15     16    17     18     19     20     GMT
Titer     3162  78850  87241  6796  12134  3340  18473  13093  76167  49593  23645



Table 8: MRKAd6-NSmut

Mice n.   21      22     23     24     25     26     27     28     29     30     GMT
Titer     125626  39751  40187  65834  60619  69933  21555  49348  29290  26859  46461



Table 9: MRKAd6-NSOPTmut

Mice n.   31     32    33   34   35     36     37   GMT
Titer     25430  3657  893  175  10442  49540  173  2785
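The GMT column of the tables above is a geometric mean of the individual titers. The sketch below shows one conventional way to compute it (mean of the logarithms, then exponentiation); the function name is hypothetical, and the method used to derive the reported values is not stated in the text. The titers are copied from Tables 6 and 9.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of a list of positive titers."""
    if not titers:
        raise ValueError("at least one titer is required")
    if any(titer <= 0 for titer in titers):
        raise ValueError("titers must be positive")
    mean_log = sum(math.log(titer) for titer in titers) / len(titers)
    return math.exp(mean_log)


TABLE_6_TITERS = [50, 253, 50, 50, 50, 2257, 504, 50, 50, 50]
TABLE_9_TITERS = [25430, 3657, 893, 175, 10442, 49540, 173]

print(round(geometric_mean_titer(TABLE_6_TITERS)))  # close to the reported 108
print(round(geometric_mean_titer(TABLE_9_TITERS)))  # close to the reported 2785
```

Using the geometric rather than the arithmetic mean keeps a few very high responders from dominating the group value, which matters here because individual titers span more than two orders of magnitude.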



T cell response in C57Black6 mice was analyzed by the quantitative
ELISPOT assay measuring the number of IFN-γ secreting T cells in response to five
pools (named from F to L+M) of 20mer peptides overlapping by ten residues
encompassing the NS3-NS5B sequence. Specific CD8+ response induced in
C57Black6 mice was analyzed by the same assay using a 20mer peptide
encompassing a CD8+ epitope for C57Black6 mice (pep1480). Cells secreting IFN-γ
in an antigen-specific manner were detected using a standard ELISPOT assay.

Spleen cells, splenocytes and peptides were produced and treated as
described in Example 3, supra. Representative data from groups of C57Black6 mice
(N=9-10) immunized with two injections of 10^9 viral particles of vectors Ad5-NS,
MRKAd5-NSmut and MRKAd6-NSmut are shown in Figure 15.

Example 14: Immunization of Rhesus macaques with Adenovectors

Rhesus macaques (N=3-4) were immunized by intramuscular injection
of CsCl purified Ad5-NS, MRKAd5-NSmut, MRKAd6-NSmut or MRKAd6-NSOPTmut
virus. Each animal received two doses of 10^11 or 10^10 vp in the deltoid
muscle at 0 and 4 weeks.

CMI was measured at different time points by a) IFN-γ ELISPOT (see
Example 3, supra), b) IFN-γ ICS and c) bulk CTL assays. These assays measure HCV
antigen-specific CD8+ and CD4+ T lymphocyte responses, and can be used for a
variety of mammals, such as humans, rhesus monkeys, mice, and rats.

The use of a specific peptide or a pool of peptides can simplify antigen
presentation in CTL cytotoxicity assays, interferon-gamma ELISPOT assays and
interferon-gamma intracellular staining assays. Peptides based on the amino acid
sequence of various HCV proteins (core, E2, NS3, NS4A, NS4B, NS5a, NS5b) were
prepared for use in these assays to measure immune responses in HCV DNA and
adenovirus vector vaccinated rhesus monkeys, as well as in HCV-infected humans.
The individual peptides are overlapping 20-mers, offset by 10 amino acids. Large
pools of peptides can be used to detect an overall response to HCV proteins while
smaller pools and individual peptides may be used to define the epitope specificity of
a response.
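The peptide design described above (20-mers offset by 10 amino acids) can be illustrated with a short sketch. The function name is hypothetical, and the handling of the C-terminal end, where the last peptide may be shorter than 20 residues, is an assumption; the text does not say how the final peptide was chosen.

```python
def overlapping_peptides(protein, length=20, offset=10):
    """Split a protein sequence into peptides of `length` residues whose
    start positions are `offset` residues apart.

    Assumption: the last peptide is allowed to be shorter than `length`
    when the protein length is not a multiple of the offset.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    peptides = []
    for start in range(0, len(protein), offset):
        peptides.append(protein[start:start + length])
        if start + length >= len(protein):
            break  # the C-terminus has been covered
    return peptides


# Illustrative 45-residue sequence (not an HCV sequence).
example = "ACDEFGHIKLMNPQRSTVWY" * 2 + "ACDEF"
for peptide in overlapping_peptides(example):
    print(peptide)
```

With a 10-residue offset every internal residue is covered by two consecutive peptides, so an epitope of up to 11 residues is always contained whole in at least one peptide.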

IFN-γ ICS

For IFN-γ ICS, 2 x 10^6 PBMC in 1 ml R10 (RPMI medium,
supplemented with 10% FCS) were stimulated with peptide pool antigens. Final
concentration of each peptide was 2 ng/ml. Cells were incubated for 1 hour in a CO2
incubator at 37°C and then Brefeldin A was added to a final concentration of 10 µg/ml
to inhibit the secretion of soluble cytokines. Cells were incubated for an additional
14-16 hours at 37°C.

Stimulation was done in the presence of co-stimulatory antibodies:
CD28 and CD49d (anti-humanCD28 BD340975 and anti-humanCD49d BD340976).
After incubation, cells were stained with fluorochrome-conjugated antibodies for
surface antigens: anti-CD3, anti-CD4, anti-CD8 (CD3-APC Biosource APS0301,
CD4-PE BD345769, CD8-PerCP BD345774).

To detect intracellular cytokines, cells were treated with FACS
permeabilization buffer 2 (BD340973), 2x final concentration. Once fixed and
permeabilized, cells were incubated with an antibody against human IFN-γ,
IFN-γ FITC (Biosource AHC4338).

Cells were resuspended in 1% formaldehyde in PBS and analyzed by
FACS within 24 hours. Four color FACS analysis was performed on a FACSCalibur
instrument (Becton Dickinson) equipped with two lasers. Acquisition was done
gating on the lymphocyte population in the Forward versus Side Scatter plot coupled
with the CD3, CD8 positive populations. At least 30,000 events of the gate were
taken. The positive cells are expressed as number of IFN-γ expressing cells per 10^6
lymphocytes.
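The normalization to 10^6 lymphocytes is a simple proportion, sketched below. The function name and the example counts are hypothetical; the 30,000-event minimum is taken from the text, and enforcing it as a hard check is an assumption made for illustration.

```python
def ifn_gamma_positive_per_million(positive_events, gated_events, min_events=30_000):
    """Express IFN-gamma positive cells per 10^6 gated lymphocytes.

    positive_events -- IFN-gamma positive events inside the gate
    gated_events    -- total events acquired in the gate
    """
    if gated_events < min_events:
        raise ValueError("fewer gated events than the acquisition minimum")
    if not 0 <= positive_events <= gated_events:
        raise ValueError("positive events must lie between 0 and gated events")
    return positive_events / gated_events * 1_000_000


# Hypothetical acquisition: 45 positive events among 30,000 gated events.
print(ifn_gamma_positive_per_million(45, 30_000))
```

Normalizing in this way makes samples with different numbers of acquired events directly comparable.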

IFN-γ ELISPOT and IFN-γ ICS data from immunized monkeys after
one or two injections of 10^10 or 10^11 vp of the different adenovectors are reported in
Figures 16A-16D, 17A, and 17B.

Bulk CTL Assays

A distinguishing effector function of T lymphocytes is the ability of
subsets of this cell population to directly lyse cells exhibiting appropriate MHC-associated
antigenic peptides. This cytotoxic activity is most often associated with
CD8+ T lymphocytes.

PBMC samples were infected with recombinant vaccine viruses
expressing HCV antigens in vitro for approximately 14 days to provide antigen
restimulation and expansion of memory T cells. Cytotoxicity against autologous B
cell lines treated with peptide antigen pools was tested.

The lytic function of the culture is measured as a percentage of specific
lysis, resulting from chromium released from target cells during a 4 hour incubation
with CTL effector cells. Specific cytotoxicity is measured and compared to irrelevant
antigen or excipient-treated B cell lines. This assay is semi-quantitative and is the
preferred means for determining whether CTL responses were elicited by the vaccine.
Data after two injections from monkeys immunized with 10^11 vp/dose with
adenovectors Ad5-NS, MRKAd5-NSmut and MRKAd6-NSmut are reported in
Figures 18A-18F.
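The percentage of specific lysis in a chromium release assay is conventionally derived from three counts: release from targets incubated with effectors, spontaneous release from targets alone, and maximum release from fully lysed targets. The text does not state the formula used, so the sketch below assumes this conventional definition; the function name and example counts are hypothetical.

```python
def percent_specific_lysis(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Conventional chromium-release calculation (assumed, not stated in the text):

    100 * (experimental - spontaneous) / (maximum - spontaneous)
    """
    if maximum_cpm <= spontaneous_cpm:
        raise ValueError("maximum release must exceed spontaneous release")
    return 100.0 * (experimental_cpm - spontaneous_cpm) / (maximum_cpm - spontaneous_cpm)


# Hypothetical counts per minute for one effector-to-target ratio.
print(percent_specific_lysis(1500, 500, 2500))
```

Comparing this value for peptide-pulsed targets against targets treated with irrelevant antigen or excipient gives the antigen-specific component described above.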

Other embodiments are within the following claims. While several
embodiments have been shown and described, various modifications may be made
without departing from the spirit and scope of the present invention.

WHAT IS CLAIMED IS:

1. A nucleic acid comprising a nucleotide sequence encoding a
Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to SEQ ID
NO: 1, provided that said polypeptide has sufficient protease activity to process itself
to produce an NS5B protein and said NS5B protein is enzymatically inactive.

2. The nucleic acid of claim 1, wherein said nucleotide sequence 
is substantially similar to the coding sequence of SEQ ID NO: 2. 

3. The nucleic acid of claim 1, wherein said nucleotide sequence
encodes for the polypeptide of SEQ ID NO: 1.

4. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

5. The nucleic acid of claim 3, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2 or SEQ ID NO: 3.

6. The nucleic acid of any one of claims 1-5, wherein said nucleic
acid is an expression vector capable of expressing said polypeptide from said
nucleotide sequence in a human cell.

7. A nucleic acid comprising a gene expression cassette able to
express a Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide substantially similar to
SEQ ID NO: 1 in a human cell, provided that said polypeptide can process itself to
produce an NS5B protein and said NS5B protein is enzymatically inactive, said
expression cassette comprising:

a) a promoter transcriptionally coupled to a nucleotide sequence
encoding said polypeptide;

b) a 5' ribosome binding site functionally coupled to said nucleotide
sequence,

c) a terminator joined to the 3' end of said nucleotide sequence, and

d) a 3' polyadenylation signal functionally coupled to said nucleotide
sequence.

8. The nucleic acid of claim 7, wherein said nucleotide sequence
is substantially similar to either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

9. The nucleic acid of claim 8, wherein said nucleic acid is a
shuttle vector further comprising a selectable marker, an origin of replication, a first
adenovirus homology region and a second adenovirus homology region flanking said
expression cassette, wherein said first homology region has at least about 100 base
pairs substantially homologous to at least the right end of a wild-type adenovirus region
from about base pairs 1-425, and said second homology region has at least about 100
base pairs substantially homologous to at least the left end of a wild-type adenovirus
region from about base pairs 3511-5792 of Ad5 or corresponding region of another
adenovirus.

10. The nucleic acid of claim 9, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

11. The nucleic acid of claim 9, wherein said nucleotide sequence
is SEQ ID NO: 2.

12. The nucleic acid of claim 9, wherein said nucleotide sequence
is either SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

13. The nucleic acid of claim 8, wherein said nucleic acid is a
plasmid suitable for administration into a human and further comprises a prokaryotic
origin of replication and a gene coding for a selectable marker.

14. The nucleic acid of claim 13, wherein said nucleotide sequence
encodes for a polypeptide of SEQ ID NO: 1.

15. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of either SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 10, or
SEQ ID NO: 11.

16. The nucleic acid of claim 14, wherein said nucleotide sequence
is the coding sequence of SEQ ID NO: 2 or SEQ ID NO: 3.

17. The nucleic acid of claim 14, wherein said promoter is the
human intermediate early cytomegalovirus promoter (intron A), said 5' ribosome
binding site consists of SEQ ID NO: 12, and said 3' polyadenylation is the bovine
growth hormone (BGH) polyadenylation signal.

18. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and a recombinant adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

19. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising a selectable marker, an origin of replication,
and

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

20. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

21. The nucleic acid of claim 20, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

22. The nucleic acid of claim 21, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

23. The nucleic acid of claim 19, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

24. The nucleic acid of claim 23, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

25. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

26. The nucleic acid of claim 24, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2 or SEQ ID NO: 3.

27. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovirus genome plasmid comprising an origin of replication, a selectable marker,
and:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

28. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

29. The nucleic acid of claim 28, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

30. The nucleic acid of claim 27, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.






WO 03/031588 



PCT/US02/32512 



31. The nucleic acid of claim 30, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

32. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of a nucleotide sequence substantially similar to SEQ ID
NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

33. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector having an adenovector genome containing an E1 deletion, an E3 deletion,
and said expression cassette.

34. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

35. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

36. The nucleic acid of claim 35, wherein said promoter is the
human intermediate early cytomegalovirus promoter, said 5' ribosome binding site
consists of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation
signal.

37. The nucleic acid of claim 36, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is either SEQ ID NO:
2, SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

38. The nucleic acid of claim 34, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

39. The nucleic acid of claim 37, where said promoter is the human
intermediate early cytomegalovirus promoter, said 5' ribosome binding site consists
of SEQ ID NO: 12, and said 3' polyadenylation is the BGH polyadenylation signal.

40. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2,
SEQ ID NO: 3, SEQ ID NO: 10, or SEQ ID NO: 11.

41. The nucleic acid of claim 39, wherein said expression cassette
is in an E1 anti-parallel orientation and said nucleotide sequence is SEQ ID NO: 2 or
SEQ ID NO: 3.

42. The nucleic acid of claim 8, wherein said nucleic acid is an
adenovector consisting of:

a) a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in an E3 parallel or E3 anti-parallel
orientation joined to said third region;

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region.

43. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5, said second region corresponds to Ad5, said third region
corresponds to Ad5, said fourth region corresponds to Ad5, and said fifth region
corresponds to Ad5.

44. The nucleic acid of claim 42, wherein said first region
corresponds to Ad5 or Ad6, said second region corresponds to Ad5 or Ad6, said third
region corresponds to Ad6, said fourth region corresponds to Ad6, and said fifth
region corresponds to Ad5 or Ad6.

45. An adenovector consisting of the nucleic acid sequence of SEQ
ID NO: 4 or a derivative thereof, wherein said derivative thereof has the HCV
polyprotein encoding sequence present in SEQ ID NO: 4 replaced with the HCV
polyprotein encoding sequence of either SEQ ID NO: 3, SEQ ID NO: 10 or SEQ ID
NO: 11.

46. An adenovector produced by a process comprising the steps of:

a) producing an adenovirus genome plasmid by homologous
recombination between the shuttle vector of claim 9 and a nucleic acid comprising:

a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to said fourth region; and

b) rescuing said adenovector from said adenovirus plasmid.



47. A cultured recombinant cell comprising the nucleic acid of
claim 6.

48. A cultured recombinant cell comprising the nucleic acid of any
one of claims 9-46.



49. A method of making an adenovector comprising the steps of:

a) producing an adenovirus genome plasmid comprising a gene
expression cassette by homologous recombination between the nucleic acid of claim 9
and a nucleic acid comprising:

a first adenovirus region from about base pair 1 to about base
pair 450 corresponding to either Ad5 or Ad6;

a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said third region; and

a fifth adenovirus region from about base pair 33967 to about
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base
pair 35759 corresponding to Ad6, joined to the fourth region; and

b) rescuing said recombinant adenovirus from said recombinant
adenovirus plasmid.

50. A pharmaceutical composition comprising the nucleic acid of 
any one of claims 13-17 and 32-46 and a pharmaceutically acceptable carrier.

51. A method of treating a patient comprising the step of 
administering to said patient an effective amount of the nucleic acid of any one of 
claims 13-17 and 32-46. 

52. The method of claim 51, wherein said patient is a human.

53. The method of claim 52, wherein said patient is not infected
with HCV.

54. The method of claim 52, wherein said patient is infected with
HCV.

55. A recombinant nucleic acid comprising one or more Ad6
regions and a region not present in Ad6, wherein at least one Ad6 region is selected
from the group consisting of: E1A, E1B, E2B, E2A, E4, L1, L2, L4, and L5.

56. The recombinant nucleic acid of claim 55, wherein said region
not present in Ad6 is an expression cassette coding for a polypeptide not found in
Ad6.

57. The recombinant nucleic acid of claim 56, wherein said
recombinant nucleic acid is an adenovirus vector defective in at least E1 that is able to
replicate when E1 is supplied in trans.

58. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 

b) said gene expression cassette in an E1 parallel or E1 anti-parallel
orientation joined to said first region;

c) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said gene expression cassette;

d) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

e) an optionally present fourth region from about base pair 28134 
to about base pair 30817 corresponding to Ad5, or from about base pair 28157 to 
about 30789 corresponding to Ad6, joined to said third region; 

f) a fifth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, wherein said fifth region is joined to said fourth 
region if said fourth region is present, or said fifth is joined to said third region if said 
fourth region is not present; and 

g) a sixth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to said fourth region; 

provided that at least one of said second, third, and fifth regions is
from Ad6.
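
For illustration only, the closing proviso of claim 58 can be read as a simple check on the serotype chosen for each region. The Python sketch below is not part of the claim. The names `PROVISO_REGIONS` and `satisfies_proviso` are hypothetical, and the sketch models only the proviso, not the region coordinates or the optional fourth region.

```python
from itertools import product

# Regions named in the proviso of claim 58.
PROVISO_REGIONS = ("second", "third", "fifth")

def satisfies_proviso(origin):
    """origin maps a region name to "Ad5" or "Ad6".

    Returns True when at least one of the second, third, and fifth
    regions is from Ad6.
    """
    return any(origin.get(name) == "Ad6" for name in PROVISO_REGIONS)

# Enumerate every Ad5/Ad6 assignment for the three proviso regions.
combos = [dict(zip(PROVISO_REGIONS, choice))
          for choice in product(("Ad5", "Ad6"), repeat=3)]
allowed = [c for c in combos if satisfies_proviso(c)]
print(len(combos), len(allowed))  # 8 7
```

Of the eight possible assignments, only the all-Ad5 combination fails the proviso.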

59. The recombinant nucleic acid of claim 57, wherein said vector
consists of:

a) a first adenovirus region from about base pair 1 to about base 
pair 450 corresponding to either Ad5 or Ad6; 
b) a second adenovirus region from about base pair 3511 to about
base pair 5548 corresponding to Ad5 or from about base pair 3508 to about base pair
5541 corresponding to Ad6, joined to said first region;

c) a third adenovirus region from about base pair 5549 to about
base pair 28133 corresponding to Ad5 or from about base pair 5542 to about base pair
28156 corresponding to Ad6, joined to said second region;

d) said gene expression cassette in a E3 parallel or E3 anti-parallel 
orientation joined to said third region; 

e) a fourth adenovirus region from about base pair 30818 to about
base pair 33966 corresponding to Ad5 or from about base pair 30789 to about base
pair 33784 corresponding to Ad6, joined to said gene expression cassette; and

f) a fifth adenovirus region from about base pair 33967 to about 
base pair 35935 corresponding to Ad5 or from about base pair 33785 to about base 
pair 35759 corresponding to Ad6, joined to said fourth region; 

provided that at least one of said second, third, and fourth regions is
from Ad6.
SLTGRDKNQV EGEVQVVSTA TQSFLATCVN
PITQMYTNVD QDLVGWQAPP GARSLTPCTC 
GDSRGSLLSP RPVSYLKGSS GGPLLCPSGH 
VPVESMETTM RSPVFTDNSS PPAVPQSFQV 
AQGYKVLVLN PSVAATLGFG AYMSKAHGID 
YGKFLADGGC SGGAYDIIIC DECHSTDSTT 
LATATPPGSV TVPHPNIEEV ALSNTGEIPF 
SKKKCDELAA KLSGLGINAV AYYRGLDVSV 
GDFDSVIDCN TCVTQTVDFS LDPTFTIETT 
RGIYRFVTPG ERPSGMFDSS VLCECYDAGC 
PGLPVCQDHL EFWESVFTGL THIDAHFLSQ 
RAQAPPPSWD QMWKCLIRLK PTLHGPTPLL 
MACMSADLEV VTSTWVLVGG VLAALAAYCL
PDREFLYQEF DEMEECASHL PYIEQGMQLA
AAPVVESKWR ALETFWAKHM WNFISGIQYL
SITSPLTTQS TLLFNILGGW VAAQLAPPSA
KVLVDILAGY GAGVAGALVA FKVMSGEMPS
VVCAAILRRH VGPGEGAVQW MNRLIAFASR
QILSSLTITQ LLKRLHQWIN EDCSTPCSGS 
QSKLLPQLPG VPFFSCQRGY KGVWRGDGIM 
IVGPKTCSNT WHGTFPINAY TTGPCTPSPA 
VGDFHYVTGM TTDNVKCPCQ VPAPEFFTEV 
TFQVGLNQYL VGSQLPCEPE PDVAVLTSML 
PSLASSSASQ LSAPSLKATC TTHHVSPDAD 
SENKVVVLDS FDPLRAEEDE REVSVPAEIL
PLLESWKDPD YVPPVVHGCP LPPIKAPPIP
ELATKTFGSS ESSAVDSGTA TALPDQASDD
GDPDLSDGSW STVSEEASED VVCCSMSYTW
SNSLLRHHNM VYATTSRSAG LRQKKVTFDR 
TVKAKLLSVE EACKLTPPHS AKSKFGYGAK 
LEDTVTPIDT TIMAKNEVFC VQPEKGGRKP 
DVVSTLPQVV MGSSYGFQYS PGQRVEFLVN
TVTENDIRVE ESIYQCCDLA PEARQAIKSL 
RRCRASGVLT TSCGNTLTCY LKASAACRAA 
1701 KLQDCTMLVN AAGLVVICES AGTQEDAASL RVFTEAMTRY SAPPGDPPQP

1751 EYDLELITSC SSNVSVAHDA SGKRVYYLTR DPTTPLARAA WETARHTPVN 

1801 SWLGNIIMYA PTLWARMILM THFFSILLAQ EQLEKALDCQ IYGACYSIEP 

1851 LDLPQIIERL HGLSAFSLHS YSPGEINRVA SCLRKLGVPP LRVWRHRARS 

1901 VRARLLSQGG RAATCGKYLF NWAVKTKLKL TPIPAASQLD LSGWFVAGYS 

1951 GGDIYHSLSR ARPRWFMLCL LLLSVGVGIY LLPNR 
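
The numbered lines above are laid out as a position label followed by blocks of ten residues. Text extraction can merge or drop characters (for example, "VV" read as "W"), which shortens a block. The following Python sketch, offered only as a proofreading aid, flags blocks whose length is not ten. The function name `check_blocks` and the sample string are hypothetical, and the reported position assumes that every earlier block on the same line is full length.

```python
def check_blocks(line, block_len=10):
    """Return (nominal_position, block) for each block whose length differs from block_len."""
    fields = line.split()
    start = int(fields[0])  # position label at the start of the line
    flagged = []
    for i, block in enumerate(fields[1:]):
        if len(block) != block_len:
            # Nominal position: assumes all earlier blocks on the line are full length.
            flagged.append((start + i * block_len, block))
    return flagged

# Hypothetical damaged copy of a line in which one block has lost a residue.
damaged = "1701 KLQDCTMLVN AAGLWICES AGTQEDAASL RVFTEAMTRY SAPPGDPPQP"
print(check_blocks(damaged))  # [(1711, 'AAGLWICES')]
```

The final line of a sequence normally ends with a short block, so a flag on the last block of the last line is expected and does not by itself indicate damage.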
CCCTTCTATG GCAAAGCCAT
TCTCATTTTC TGTCATTCCA
TGTCAGGCCT CGGAATCAAC
TCCGTCATAC CAACTATCGG
1251 GATGACGGGC TATACGGGCG
1301 GTGTCACCCA GACAGTCGAC
1351 ACGACGACCG TGCCTCAAGA
1401 GACTGGCAGG GGTAGGAGAG
1451 GGCCCTCGGG CATGTTCGAT
GGCTGTGCTT GGTACGAGCT
1551 GGCCTACCTG AACACACCAG
1601 TCTGGGAGAG TGTCTTCACA
1651 TCCCAGACCA
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GGCCTACTCC CAACAGACGC GGGGCCTACT 
TTACAGGCCG GGACAAGAAC CAGGTCGAGG 
ACCGCAACAC AATCCTTCCT GGCGACCTGC 
CGTTTACCAT GGTGCTGGCT CAAAGACCTT 
TCACCCAGAT GTACACTAAT GTGGACCAGG 
CCCCCCGGGG CGCGTTCCTT GACACCATGC 
TTACTTGGTC ACGAGACATG CTGACGTCAT 
ACAGTAGGGG GAGCCTGCTC TCCCCCAGGC 
TCTTCGGGTG GTCCACTGCT CTGCCCTTCG 
CCGGGCTGCC GTATGCACCC GGGGGGTTGC 
CCGTAGAGTC CATGGAAACT ACTATGCGGT 
TCATCCCCCC CGGCCGTACC GCAGTCATTT 
TCCCACTGGC AGCGGCAAGA GTACTAAAGT 
AAGGGTACAA GGTGCTCGTC CTCAATCCGT 
TTTGGGGCGT ATATGTCTAA GGCACACGGT 
TGGGGTAAGG ACCATTACCA CAGGCGCCCC 
GCAAGTTTCT TGCCGATGGT GGTTGCTCTG 
ATATGTGATG AGTGCCATTC AACTGACTCG 
CACAGTCCTG GACCAAGCGG AGACGGCTGG 
CCACCGCTAC GCCTCCGGGA TCGGTCACCG 
GAGGTGGCCC TGTCTAATAC TGGAGAGATC 
CCCCATTGAA GCCATCAGGG GGGGAAGGCA 
AGAAGAAGTG CGACGAGCTC GCCGCAAAGC 
GCTGTGGCGT ATTACCGGGG GCTCGATGTG
AGACGTCGTT GTCGTGGCAA CAGACGCTCT 
ACTTTGACTC AGTGATCGAC TGTAACACAT 
TTCAGCTTGG ATCCCACCTT CACCATTGAG 
CGCAGTGTCG CGCTCGCAGC GGCGGGGTAG 
GCATCTACAG GTTTGTGACT CCGGGAGAAC 
TCCTCGGTCC TGTGTGAGTG CTATGACGCG 
CACCCCCGCC GAGACCTCGG TTAGGTTGCG 
GGTTGCCCGT TTGCCAGGAC CACCTGGAGT 
GGCCTCACCC ACATAGATGC ACACTTCTTG 
AGACAACTTC CCCTACCTGG TAGCATACCA 
CGCCCTGGTC GTCGGGGTCG
GTCCGGGAGA GGGGGCTGTG
TCGCGGGGTA ATCATGTTTC
2751 CGCAGCGCGT GTTACTCAGA
2801 TGAAAAGGCT CCACCAGTGG
2851 GGCTCGTGGC TAAGGGATGT
2901 CTTCAAGACC TGGCTCCAGT
2951 CTTTTTTCTC GTGCCAACGC
3001 ATCATGCAAA CCACCTGCCC
3051 AAACGGTTCC ATGAGGATCG
3101 ATGGAACATT CCCCATCAAC
3151 CCAGCGCCAA ACTATTCTAG
3201 CGTGGAGGTC ACGCGGGTGG
3251 CTGACAACGT AAAGTGCCCA
3301 GAGGTGGACG GAGTGCGGTT
3351 CCTACGGGAG GAGGTTACAT



CTCAGGCCCC ACCTCCATCA TGGGATCAAA 
CTGAAACCTA CGCTGCACGG GCCAACACCC 
CGTCCAAAAT GAGGTCACCC TCACCCACCC 
CATGCATGTC GGCTGACCTG GAGGTCGTCA 
GGCGGAGTCC TTGCAGCTCT GGCCGCGTAT 
GGTCATTGTG GGTAGGATTA TCTTGTCCGG 
ACAGGGAGTT TCTCTACCAG GAGTTCGATG 
CACCTCCCTT ACATCGAGCA GGGAATGCAG 
GAAAGCGCTC GGGTTACTGC AAACAGCCAC 
CTCCCGTGGT GGAGTCCAAG TGGCGAGCCC 
CACATGTGGA ATTTCATCAG CGGGATACAG 
TCTGCCTGGG AACCCCGCAA TAGCATCATT 
TCACCAGCCC GCTCACCACC CAAAGTACCC 
GGGTGGGTGG CTGCCCAACT CGCCCCCCCC 
GGGCGCCGGC ATCGCCGGTG CGGCTGTTGG 
TGCTTGTGGA CATTCTGGCG GGTTATGGAG 
GTGGCCTTCA AGGTCATGAG CGGCGAGATG 
CAATCTACTT CCTGCCATCC TCTCTCCTGG 
TGTGTGCAGC AATACTGCGT CGACACGTGG 
CAGTGGATGA ACCGGCTGAT AGCGTTCGCC 
CCCCACGCAC TATGTGCCTG AGAGCGACGC 
TCCTCTCCAG CCTTACCATC ACTCAGCTGC 
ATTAATGAAG ACTGCTCCAC ACCGTGTTCC 
TTGGGACTGG ATATGCACGG TGTTGACTGA 
CCAAGCTCCT GCCGCAGCTA CCGGGAGTCC 
GGGTACAAGG GAGTCTGGCG GGGAGACGGC 
ATGTGGAGCA CAGATCACCG GACATGTCAA 
TCGGGCCTAA GACCTGCAGC AACACGTGGC 
GCATACACCA CGGGCCCCTG CACACCCTCT 
GGCGCTGTGG CGGGTGGCCG CTGAGGAGTA 
GGGATTTCCA CTACGTGACG GGCATGACCA 
TGCCAGGTTC CGGCTCCTGA ATTCTTCACG 
GCACAGGTAC GCTCCGGCGT GCAGGCCTCT 
TCCAGGTCGG GCTCAACCAA TACCTGGTTG 
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AAGCCAGCCC 


GCCTTATCGT 
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CCCGAACCGG ATGTAGCAGT GCTCACTTCC 
CATCACAGCA GAAACGGCTA AGCGTAGGTT 
CCTTGGCCAG CTCTTCAGCT AGCCAGTTGT 
ACATGCACTA CCCACCATGT CTCTCCGGAC 
CCTCCTGTGG CGGCAGGAGA TGGGCGGGAA 
AGAACAAGGT GGTAGTCCTG GACTCTTTCG 
GATGAGAGGG AAGTATCCGT TCCGGCGGAG 
GTTCCCCGCA GCGATGCCCA TCTGGGCGCG 
TGTTAGAGTC CTGGAAGGAC CCGGACTACG 
TGCCCGTTGC CACCTATCAA GGCCCCTCCA 
GAGGACGGTT GTCCTAACAG AGTCCTCCGT 
TCGCTACTAA GACCTTCGGC AGCTCCGAAT 
ACGGCGACCG CCCTTCCTGA CCAGGCCTCC 
CGACGTTGAG TCGTACTCCT CCATGCCCCC 
ACCCCGATCT CAGTGACGGG TCTTGGTCTA 
GAGGATGTCG TCTGCTGCTC AATGTCCTAC 
CACGCCATGC GCTGCGGAGG AAAGCAAGCT 
ACTCTTTGCT GCGCCACCAT AACATGGTTT 
GCAGGCCTGC GGCAGAAGAA GGTCACCTTT 
CGACCACTAC CGGGACGTGC TCAAGGAGAT 
TTAAGGCTAA ACTCCTATCC GTAGAGGAAG 
CATTCGGCCA AATCCAAGTT TGGCTATGGG 
ATCCAGCAAG GCCGTTAACC ACATCCACTC 
AAGACACTGT GACACCAATT GACACCACCA 
TTCTGTGTCC AACCAGAGAA AGGAGGCCGT 
ATTCCCAGAT CTGGGAGTCC GTGTATGCGA 
TGGTCTCCAC CCTTCCTCAG GTCGTGATGG 
TACTCTCCTG GGCAGCGAGT CGAGTTCCTG 
GAAAAACCCC ATGGGCTTTT CATATGACAC 
TCACCGAGAA CGACATCCGT GTTGAGGAGT 
TTGGCCCCCG AAGCCAGACA GGCCATAAAA 
TATCGGGGGT CCTCTGACTA ATTCAAAAGG 
GGTGCCGCGC GAGCGGCGTG CTGACGACTA 
TGTTACTTGA AGGCCTCTGC AGCCTGTCGA 



FIG. 2C 



WO 03/031588 



PCT/US02/32512 



6/92 



5101 GCTGCGAAGC TCCAGGACTG
5151 CGTTATCTGT GAAAGCGCGG
5201 TCTTCACGGA GGCTATGACT
5251 CAACCAGAAT ACGACTTGGA
5301 GGTCGCCCAC GATGCATCAG
5351 CCACCACCCC CCTCGCACGG
GTTAACTCCT GGCTAGGCAA
5451 AAGGATGATT CTGATGACTC
AGCCCTGGAC
5551 GAGCCACTTG
TGGCAAGTAC CTCTTCAACT
5801 CAATCCCGGC TGCGTCCCAG
5851 TACAGCGGGG GAGACATATA
5901
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CACGATGCTC GTGAACGCCG CCGGCCTTGT 
GAACCCAAGA GGACGCGGCG AGCCTACGAG 
AGGTACTCTG CCCCCCCCGG GGACCCGCCC 
GCTGATAACA TCATGTTCCT CCAATGTGTC 
GCAAAAGGGT GTACTACCTC ACCCGTGATC 
GCTGCGTGGG AAACAGCTAG ACACACTCCA 
CATTATCATG TATGCGCCCA CTTTGTGGGC 
ACTTCTTCTC CATCCTTCTA GCACAGGAGC 
TGCCAGATCT ACGGGGCCTG TTACTCCATT 
GATCATTGAA CGACTCCATG GCCTTAGCGC 
CTCCAGGTGA GATCAATAGG GTGGCTTCAT 
CCACCCTTGC GAGTCTGGAG ACATCGGGCC 
ACTGTCCCAG GGGGGGAGGG CCGCCACTTG 
GGGCAGTGAA GACCAAACTC AAACTCACTC 
CTGGACTTGT CCGGCTGGTT CGTTGCTGGT 
TCACAGCCTG TCTCGTGCCC GACCCCGCTG 
TACTTTCTGT AGGGGTAGGC ATCTACCTGC 
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CGCCTACCTG 


AACACCCCCG 
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CGTGTTCACC 


1651 
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CGCCTACAGC CAGCAGACCC GCGGCCTGCT 
TGACCGGCCG CGACAAGAAC CAGGTGGAGG 
ACCGCCACCC AGAGCTTCCT GGCCACCTGC 
CGTGTACCAC GGCGCCGGCA GCAAGACCCT 
TCACCCAGAT GTACACCAAC GTGGACCAGG 
CCCCCCGGCG CCCGCAGCCT GACCCCCTGC 
GTACCTGGTG ACCCGCCACG CCGACGTGAT 
ACAGCCGCGG CAGCCTGCTG AGCCCCCGCC 
AGCAGCGGCG GCCCCCTGCT GTGCCCCAGC 
CCGCGCCGCC GTGTGCACCC GCGGCGTGGC 
CCGTGGAGAG CATGGAGACC ACCATGCGCA 
AGCAGCCCCC CCGCCGTGCC CCAGAGCTTC 
CCCCACCGGC AGCGGCAAGA GCACCAAGGT 
AGGGCTACAA GGTGCTGGTG CTGAACCCCA 
TTCGGCGCCT ACATGAGCAA GGCCCACGGC 
CGGCGTGCGC ACCATCACCA CCGGCGCCCC 
GCAAGTTCCT GGCCGACGGC GGCTGCAGCG 
ATCTGCGACG AGTGCCACAG CACCGACAGC 
CACCGTGCTG GACCAGGCCG AGACCGCCGG 
CCACCGCCAC CCCCCCCGGC AGCGTGACCG 
GAGGTGGCCC TGAGCAACAC CGGCGAGATC 
CCCCATCGAG GCCATCCGCG GCGGCCGCCA 
AGAAGAAGTG CGACGAGCTG GCCGCCAAGC 
GCCGTGGCCT ACTACCGCGG CCTGGACGTG 
CGACGTGGTG GTGGTGGCCA CCGACGCCCT 
ACTTCGACAG CGTGATCGAC TGCAACACCT 
TTCAGCCTGG ACCCCACCTT CACCATCGAG 
CGCCGTGAGC CGCAGCCAGC GCCGCGGCCG 
GCATCTACCG CTTCGTGACC CCCGGCGAGC 
AGCAGCGTGC TGTGCGAGTG CTACGACGCC 
GACCCCCGCC GAGACCAGCG TGCGCCTGCG 
GCCTGCCCGT GTGCCAGGAC CACCTGGAGT 
GGCCTGACCC ACATCGACGC CCACTTCCTG 
CGACAACTTC CCCTACCTGG TGGCCTACCA 



FIG. 3A 






1701 GGCCACCGTG TGCGCCCGCG CCCAGGCCCC CCCCCCCAGC TGGGACCAGA
1751 TGTGGAAGTG CCTGATCCGC CTGAAGCCCA CCCTGCACGG CCCCACCCCC
1801 CTGCTGTACC GCCTGGGCGC CGTGCAGAAC GAGGTGACCC TGACCCACCC
1851 CATCACCAAG TACATCATGG CCTGCATGAG CGCCGACCTG GAGGTGGTGA
1901 CCAGCACCTG GGTGCTGGTG GGCGGCGTGC TGGCCGCCCT GGCCGCCTAC
1951 TGCCTGACCA CCGGCAGCGT GGTGATCGTG GGCCGCATCA TCCTGAGCGG
2001 CCGCCCCGCC ATCGTGCCCG ACCGCGAGTT CCTGTACCAG GAGTTCGACG
2051 AGATGGAGGA GTGCGCCAGC CACCTGCCCT ACATCGAGCA GGGCATGCAG
2101 CTGGCCGAGC AGTTCAAGCA GAAGGCCCTG GGCCTGCTGC AGACCGCCAC
2151 CAAGCAGGCC GAGGCCGCCG CCCCCGTGGT GGAGAGCAAG TGGCGCGCCC
2201 TGGAGACCTT CTGGGCCAAG CACATGTGGA ACTTCATCAG CGGCATCCAG
2251 TACCTGGCCG GCCTGAGCAC CCTGCCCGGC AACCCCGCCA TCGCCAGCCT
2301 GATGGCCTTC ACCGCCAGCA TCACCAGCCC CCTGACCACC CAGAGCACCC
2351 TGCTGTTCAA CATCCTGGGC GGCTGGGTGG CCGCCCAGCT GGCCCCCCCC
2401 AGCGCCGCCA GCGCCTTCGT GGGCGCCGGC ATCGCCGGCG CCGCCGTGGG
2451 CAGCATCGGC CTGGGCAAGG TGCTGGTGGA CATCCTGGCC GGCTACGGCG
2501 CCGGCGTGGC CGGCGCCCTG GTGGCCTTCA AGGTGATGAG CGGCGAGATG
2551 CCCAGCACCG AGGACCTGGT GAACCTGCTG CCCGCCATCC TGAGCCCCGG
2601 CGCCCTGGTG GTGGGCGTGG TGTGCGCCGC CATCCTGCGC CGCCACGTGG
2651 GCCCCGGCGA GGGCGCCGTG CAGTGGATGA ACCGCCTGAT CGCCTTCGCC
2701 AGCCGCGGCA ACCACGTGAG CCCCACCCAC TACGTGCCCG AGAGCGACGC
2751 CGCCGCCCGC GTGACCCAGA TCCTGAGCAG CCTGACCATC ACCCAGCTGC
2801 TGAAGCGCCT GCACCAGTGG ATCAACGAGG ACTGCAGCAC CCCCTGCAGC
2851 GGCAGCTGGC TGCGCGACGT GTGGGACTGG ATCTGCACCG TGCTGACCGA
2901 CTTCAAGACC TGGCTGCAGA GCAAGCTGCT GCCCCAGCTG CCCGGCGTGC
2951 CCTTCTTCAG CTGCCAGCGC GGCTACAAGG GCGTGTGGCG CGGCGACGGC
3001 ATCATGCAGA CCACCTGCCC CTGCGGCGCC CAGATCACCG GCCACGTGAA
3051 GAACGGCAGC ATGCGCATCG TGGGCCCCAA GACCTGCAGC AACACCTGGC
3101 ACGGCACCTT CCCCATCAAC GCCTACACCA CCGGCCCCTG CACCCCCAGC
3151 CCCGCCCCCA ACTACAGCCG CGCCCTGTGG CGCGTGGCCG CCGAGGAGTA
3201 CGTGGAGGTG ACCCGCGTGG GCGACTTCCA CTACGTGACC GGCATGACCA
3251 CCGACAACGT GAAGTGCCCC TGCCAGGTGC CCGCCCCCGA GTTCTTCACC
3301 GAGGTGGACG
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CCCGAGCCCG ACGTGGCCGT GCTGACCAGC 
CATCACCGCC GAGACCGCCA AGCGCCGCCT 
GCCTGGCCAG CAGCAGCGCC AGCCAGCTGA 
ACCTGCACCA CCCACCACGT GAGCCCCGAC 
CCTGCTGTGG CGCCAGGAGA TGGGCGGCAA 
AGAACAAGGT GGTGGTGCTG GACAGCTTCG 
GACGAGCGCG AGGTGAGCGT GCCCGCCGAG 
GTTCCCCGCC GCCATGCCCA TCTGGGCCCG 
TGCTGGAGAG CTGGAAGGAC CCCGACTACG 
TGCCCCCTGC CCCCCATCAA GGCCCCCCCC 
GCGCACCGTG GTGCTGACCG AGAGCAGCGT 
TGGCCACCAA GACCTTCGGC AGCAGCGAGA 
ACCGCCACCG CCCTGCCCGA CCAGGCCAGC 
CGACGTGGAG AGCTACAGCA GCATGCCCCC 
ACCCCGACCT GAGCGACGGC AGCTGGAGCA 
GAGGACGTGG TGTGCTGCAG CATGAGCTAC 
CACCCCCTGC GCCGCCGAGG AGAGCAAGCT 
ACAGCCTGCT GCGCCACCAC AACATGGTGT 
GCCGGCCTGC GCCAGAAGAA GGTGACCTTC 
CGACCACTAC CGCGACGTGC TGAAGGAGAT 
TGAAGGCCAA GCTGCTGAGC GTGGAGGAGG 
CACAGCGCCA AGAGCAAGTT CGGCTACGGC 
GAGCAGCAAG GCCGTGAACC ACATCCACAG 
AGGACACCGT GACCCCCATC GACACCACCA 
TTCTGCGTGC AGCCCGAGAA GGGCGGCCGC 
GTTCCCCGAC CTGGGCGTGC GCGTGTGCGA 
TGGTGAGCAC CCTGCCCCAG GTGGTGATGG 
TACAGCCCCG GCCAGCGCGT GGAGTTCCTG 
GAAGAACCCC ATGGGCTTCA GCTACGACAC 
TGACCGAGAA CGACATCCGC GTGGAGGAGA 
CTGGCCCCCG AGGCCCGCCA GGCCATCAAG 
CATCGGCGGC CCCCTGACCA ACAGCAAGGG 
GCTGCCGCGC CAGCGGCGTG CTGACCACCA 
TGCTACCTGA AGGCCAGCGC CGCCTGCCGC 
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CACCATGCTG GTGAACGCCG CCGGCCTGGT 
GCACCCAGGA GGACGCCGCC AGCCTGCGCG 
CGCTACAGCG CCCCCCCCGG CGACCCCCCC 
GCTGATCACC AGCTGCAGCA GCAACGTGAG 
GCAAGCGCGT GTACTACCTG ACCCGCGACC 
GCCGCCTGGG AGACCGCCCG CCACACCCCC 
CATCATCATG TACGCCCCCA CCCTGTGGGC 
ACTTCTTCAG CATCCTGCTG GCCCAGGAGC 
TGCCAGATCT ACGGCGCCTG CTACAGCATC 
GATCATCGAG CGCCTGCACG GCCTGAGCGC 
GCCCCGGCGA GATCAACCGC GTGGCCAGCT 
CCCCCCCTGC GCGTGTGGCG CCACCGCGCC 
GCTGAGCCAG GGCGGCCGCG CCGCCACCTG 
GGGCCGTGAA GACCAAGCTG AAGCTGACCC 
CTGGACCTGA GCGGCTGGTT CGTGGCCGGC 
CCACAGCCTG AGCCGCGCCC GCCCCCGCTG 
TGCTGAGCGT GGGCGTGGGC ATCTACCTGC 
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1 catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 
61 ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 
121 gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 
181 gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 
241 taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 
301 agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg
361 gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 
421 cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 
481 catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 
541 gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 
601 tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 
661 cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 
721 attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 
781 atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt
841 atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 
901 tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 
961 actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc
1021 aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 
1081 gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 
1141 cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 
1201 tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt gagatctgcc 

1261 accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc
1321 actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc
1381 gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt
1441 gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 
1501 gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 
1561 tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg 
1621 cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 
1681 tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta 
1741 tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 
1801 atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 
1861 gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 
1921 gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 
1981 ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 
2041 attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 
2101 tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 
2161 acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 
2221 gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 
2281 gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc
2341 atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 
2401 gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 
2461 gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 
2521 acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 
2581 agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 
2641 tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 
2701 ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 
2761 tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac
2821 acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 
2881 ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 
2941 tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 
3001 gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 
3061 ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 
3121 atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 
3181 ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 
3241 aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 
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3361 gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag 
3421 gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac 
3481 atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac 
3541 cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa 
3601 agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 
3661 gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 
3721 gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 
3781 gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct
3841 gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 
3901 cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 
3961 cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 
4021 actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 
4081 aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 
4141 tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 
4201 ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 
4261 atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 
4321 aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 
4381 tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 
4441 gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 
4501 atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 
4561 gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag 
4621 gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 
4681 gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 
4741 acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 
4801 cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 
4861 gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 
4921 gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 
4981 gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 
5041 atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 
5101 gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 
5161 ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 
5221 gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 
5281 gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 
5341 tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 
5401 tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 
5461 tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 
5521 ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 
5581 ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 
5641 gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 
5701 gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 
5761 aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 
5821 ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 
5881 tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 
5941 ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 
6001 gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 
6061 aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 
6121 tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 
6181 gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 
6241 ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 
6301 acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 
6361 gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 
6421 agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 
6481 tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 
6541 tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 
6601 cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 
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6661 aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg 
6721 atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 
6781 cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 
6841 ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 
6901 gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 
6961 agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 
7021 ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 
7081 gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 
7141 cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 
7201 tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 
7261 ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 
7321 ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 
7381 ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc 
7441 ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 
7501 ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 
7561 gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 
7621 atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 
7681 gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 
7741 actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 
7801 tttgctttcc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 
7861 aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 
7921 cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 
7981 gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 
8041 tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 
8101 ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 
8161 atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 
8221 gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 
8281 ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 
8341 agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 
8401 gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 
8461 tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 
8521 gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 
8581 ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 
8641 aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 
8701 ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 
8761 ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg 
8821 gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 
8881 gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 
8941 ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 
9001 ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 
9061 aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 
9121 agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct 
9181 cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 
9241 gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 
9301 tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg 
9361 tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt 
9421 catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 
9481 cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 
9541 ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 
9601 tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 
9661 tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 
9721 cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 
9781 actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 
9841 aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 
9901 cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 
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9961 ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat 
10021 cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 
10081 ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 
10141 tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 
10201 gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 
10261 tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 
10321 gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 
10381 gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 
10441 gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 
10501 gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt 
10561 cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 
10621 gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 
10681 cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 
10741 ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 
10801 cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 
10861 tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 
10921 ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 
10981 gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 
11041 acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 
11101 tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 
11161 actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 
11221 cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt 
11281 actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 
11341 gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 
11401 cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 
11461 ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 
11521 gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 
11581 gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 
11641 aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaagg 
11701 tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 
11761 cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 
11821 gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 
11881 ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 
11941 agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 
12001 gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc
12061 agtactggca gcggtgcacg ggctgtacat cctgcacgag gttgacctga cgaccgcgca
12121 caaggaagca gagtgggaat ttgagcccct cgcctggcgg gtttggctgg tggtcttcta 
12181 cttcggctgc ttgtccttga ccgtctggct gctcgagggg agttacggtg gatcggacca 
12241 ccacgccgcg cgagcccaaa gtccagatgt ccgcgcgcgg cggtcggagc ttgatgacaa
12301 catcgcgcag atgggagctg tccatggtct ggagctcccg cggcgtcagg tcaggcggga 
12361 gctcctgcag gtttacctcg catagccggg tcagggcgcg ggctaggtcc aggtgatacc 
12421 tgatttccag gggctggttg gtggcggcgt cgatggcttg caagaggccg catccccgcg 
12481 gcgcgactac ggtaccgcgc ggcgggcggt gggccgcggg ggtgtccttg gatgatgcat 
12541 ctaaaagcgg tgacgcgggc gggcccccgg aggtaggggg ggctcgggac ccgccgggag
12601 agggggcagg ggcacgtcgg cgccgcgcgc gggcaggagc tggtgctgcg cgcggaggtt
12661 gctggcgaac gcgacgacgc ggcggttgat ctcctgaatc tggcgcctct gcgtgaagac 
12721 gacgggcccg gtgagcttga acctgaaaga gagttcgaca gaatcaattt cggtgtcgtt 
12781 gacggcggcc tggcgcaaaa tctcctgcac gtctcctgag ttgtcttgat aggcgatctc 
12841 ggccatgaac tgctcgatct cttcctcctg gagatctccg cgtccggctc gctccacggt 
12901 ggcggcgagg tcgttggaga tgcgggccat gagctgcgag aaggcgttga ggcctccctc 
12961 gttccagacg cggctgtaga ccacgccccc ttcggcatcg cgggcgcgca tgaccacctg
13021 cgcgagattg agctccacgt gccgggcgaa gacggcgtag tttcgcaggc gctgaaagag
13081 gtagttgagg gtggtggcgg tgtgttctgc cacgaagaag tacataaccc agcgccgcaa
13141 cgtggattcg ttgatatccc ccaaggcctc aaggcgctcc atggcctcgt agaagtccac 
13201 ggcgaagttg aaaaactggg agttgcgcgc cgacacggtt aactcctcct ccagaagacg
13261 gatgagctcg gcgacagtgt cgcgcacctc gcgctcaaag gctacagggg cctcttcttc
13321 ttcttcaatc tcctcttcca taagggcctc cccttcttct tcttctggcg gcggtggggg 
13381 aggggggaca cggcggcgac gacggcgcac cgggaggcgg tcgacaaagc gctcgatcat 
13441 ctccccgcgg cgacggcgca tggtctcggt gacggcgcgg ccgttctcgc gggggcgcag 
13501 ttggaagacg ccgcccgtca tgtcccggtt atgggttggc ggggggctgc cgtgcggcag 
13561 ggatacggcg ctaacgatgc atctcaacaa ttgttgtgta ggtactccgc caccgaggga 
13621 cctgagcgag tccgcatcga ccggatcgga aaacctctcg agaaaggcgt ctaaccagtc 
13681 acagtcgcaa ggtaggctga gcaccgtggc gggcggcagc gggcggcggt cggggttgtt
13741 tctggcggag gtgctgctga tgatgtaatt aaagtaggcg gtcttgagac ggcggatggt 
13801 cgacagaagc accatgtcct tgggtccggc ctgctgaatg cgcaggcggt cggccatgcc 
13861 ccaggcttcg ttttgacatc ggcgcaggtc tttgtagtag tcttgcatga gcctttctac
13921 cggcacttct tcttctcctt cctcttgtcc tgcatctctt gcatctatcg ctgcggcggc 
13981 ggcggagttt ggccgtaggt ggcgccctct tcctcccatg cgtgtgaccc cgaagcccct 
14041 catcggctga agcagggcca ggtcggcgac aacgcgctcg gctaatatgg cctgctgcac 
14101 ctgcgtgagg gtagactgga agtcgtccat gtccacaaag cggtggtatg cgcccgtgtt 
14161 gatggtgtaa gtgcagttgg ccataacgga ccagttaacg gtctggtgac ccggctgcga 
14221 gagctcggtg tacctgagac gcgagtaagc ccttgagtca aagacgtagt cgttgcaagt 
14281 ccgcaccagg tactggtatc ccaccaaaaa gtgcggcggc ggctggcggt agaggggcca 
14341 gcgtagggtg gccggggctc cgggggcgag gtcttccaac ataaggcgat gatatccgta 
14401 gatgtacctg gacatccagg tgatgccggc ggcggtggtg gaggcgcgcg gaaagtcacg 
14461 gacgcggttc cagatgttgc gcagcggcaa aaagtgctcc atggtcggga cgctctggcc 
14521 ggtcaggcgc gcgcagtcgt tgacgctcta gaccgtgcaa aaggagagcc tgtaagcggg 
14581 cactcttccg tggtctggtg gataaattcg caagggtatc atggcggacg accggggttc 
14641 gaaccccgga tccggccgtc cgccgtgatc catgcggtta ccgcccgcgt gtcgaaccca 
14701 ggtgtgcgac gtcagacaac gggggagcgc tccttttggc ttccttccag gcgcggcgga 
14761 tgctgcgcta gcttttttgg ccactggccg cgcgcggcgt aagcggttag gctggaaagc 
14821 gaaagcatta agtggctcgc tccctgtagc cggagggtta ttttccaagg gttgagtcgc 
14881 gggacccccg gttcgagtct cgggccggcc ggactgcggc gaacgggggt ttgcctcccc 
14941 gtcatgcaag accccgcttg caaattcctc cggaaacagg gacgagcccc ttttttgctt 
15001 ttcccagatg catccggtgc tgcggcagat gcgcccccct cctcagcagc ggcaagagca 
15061 agagcagcgg cagacatgca gggcaccctc cccttctcct accgcgtcag gaggggcaac 
15121 atccgcggct gacgcggcgg cagatggtga ttacgaaccc ccgcggcgcc ggacccggca 
15181 ctacttggac ttggaggagg gcgagggcct ggcgcggcta ggagcgccct ctcctgagcg 
15241 acacccaagg gtgcagctga agcgtgacac gcgcgaggcg tacgtgccgc ggcagaacct 
15301 gtttcgcgac cgcgagggag aggagcccga ggagatgcgg gatcgaaagt tccatgcagg
15361 gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc
15421 cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 
15481 cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 
15541 gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 
15601 aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 
15661 gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 
15721 gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 
15781 cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 
15841 ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 
15901 ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 
15961 tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 
16021 cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 
16081 cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 
16141 ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg
16201 cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt
16261 gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 
16321 agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg
16381 tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc
16441 gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 
16501 atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac 
16561 gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac
16621 cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 
16681 aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 
16741 cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 
16801 ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 
16861 ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 
16921 gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 
16981 ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 
17041 cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 
17101 caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 
17161 accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 
17221 agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 
17281 gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg 
17341 tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 
17401 gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 
17461 ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 
17521 gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 
17581 gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc 
17641 gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 
17701 agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 
17761 ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 
17821 ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 
17881 ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac 
17941 gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca 
18001 caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 
18061 tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 
18121 ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 
18181 ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 
18241 gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 
18301 cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 
18361 ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 
18421 ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 
18481 aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 
18541 ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 
18601 gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 
18661 gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 
18721 cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 
18781 tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 
18841 aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 
18901 gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 
18961 cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 
19021 gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 
19081 agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 
19141 tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 
19201 cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 
19261 cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 
19321 ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 
19381 atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 
19441 gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 
19501 ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 
19561 ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 
19621 gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc 
19681 tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 
19741 gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 
19801 ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 
19861 cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa
19921 gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca 
19981 caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga 
20041 ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac 
20101 cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg 
20161 tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg 
20221 cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat 
20281 tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag 
20341 tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg ttagcggcct 
20401 gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga 
20461 ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa 
20521 aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga 
20581 agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga 
20641 tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt 
20701 acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac
20761 gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga 
20821 ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa 
20881 ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac 
20941 actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taaagcgcga 
21001 gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga 
21061 tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat 
21121 caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag 
21181 tagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc 
21241 ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga 
21301 ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag 
21361 gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc 
21421 tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg 
21481 aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc 
21541 cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca 
21601 ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg 
21661 cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg 
21721 ccacggcctg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg 
21781 tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc 
21841 cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta 
21901 catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt 
21961 ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt 
22021 catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg 
22081 ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc 
22141 ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca 
22201 aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc 
22261 agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc 
22321 ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga 
22381 agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg 
22441 cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc 
22501 cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc 
22561 gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc 
22621 gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg 
22681 tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg 
22741 tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt 
22801 ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg 
22861 acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact 
22921 tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag 
22981 accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact 
23041 cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca 
23101 cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca 
23161 ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg
23221 aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 
23281 aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 
23341 gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 
23401 ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 
23461 aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 
23521 tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta 
23581 tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 
23641 caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 
23701 taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 
23761 ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 
23821 acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc 
23881 aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 
23941 agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 
24001 ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 
24061 caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 
24121 caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 
24181 aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 
24241 tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 
24301 accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 
24361 tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 
24421 acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 
24481 tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 
24541 ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 
24601 atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 
24661 ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 
24721 ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 
24781 ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 
24841 catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 
24901 ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 
24961 ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 
25021 ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 
25081 cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 
25141 tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca 
25201 aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 
25261 atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 
25321 tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 
25381 acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 
25441 gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 
25501 tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 
25561 atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 
25621 tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 
25681 gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 
25741 tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 
25801 ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 
25861 cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 
25921 aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 
25981 ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg 
26041 tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct 
26101 attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 
26161 ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 
26221 gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta 
26281 cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 
26341 gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 
26401 tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 
26461 gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc
26521 acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc
26581 gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc
26641 ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg
26701 ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct
26761 ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat
26821 gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct 
26881 gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 
26941 ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 
27001 ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 
27061 accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt 
27121 cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 
27181 acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 
27241 cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 
27301 tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 
27361 ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 
27421 ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 
27481 ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 
27541 ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 
27601 cttcattcag ccgccgcacc gtgcgcttac ctcccttgcc gtgcttgatt agcaccggtg 
27661 ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 
27721 tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 
27781 acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca
27841 gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 
27901 ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 
27961 gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 
28021 gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 
28081 acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 
28141 ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 
28201 acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 
28261 aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg
28321 gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 
28381 ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 
28441 ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 
28501 catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 
28561 ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 
28621 gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 
28681 tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 
28741 ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 
28801 gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 
28861 cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 
28921 gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg 
28981 cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 
29041 agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 
29101 ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 
29161 gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 
29221 cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 
29281 agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 
29341 ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 
29401 agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 
29461 ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 
29521 tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 
29581 caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 
29641 gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 
29701 cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc
29761 gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa
29821 gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 
29881 cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 
29941 aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 
30001 agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca 
30061 aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 
30121 gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg
30181 cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag
30241 gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa 
30301 gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 
30361 tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt 
30421 gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga
30481 tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag 
30541 caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 
30601 ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc 
30661 gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 
30721 ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 
30781 tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 
30841 ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta 
30901 tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 
30961 gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 
31021 ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 
31081 cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 
31141 agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 
31201 ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 
31261 catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 
31321 aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 
31381 ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 
31441 agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 
31501 tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 
31561 tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 
31621 gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 
31681 ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 
31741 gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 
31801 tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 
31861 ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 
31921 cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 
31981 cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 
32041 gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 
32101 tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 
32161 attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 
32221 ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 
32281 aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 
32341 cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 
32401 gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 
32461 caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 
32521 ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 
32581 tcaaatacaa tcactgtttc tcaaccgcca aaaaaaacaa agtccaatat aactttggaa 
32641 acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 
32701 gtggtctctg acaacactct taccacgcaa tcacaagcac cgctaaccgt gcaagactca 
32761 aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 
32821 acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 
32881 cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 
32941 ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta 
33001 acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 
33061 ggcgcaatag ggtttgatac atctggcaac atggaactta aaactggaga tggcctctat
33121 gtggatagcg ccggtcctaa ccaaaaacta catattaatc taaataccac aaaaggcctt 
33181 gcttttgaca acaccgcaat aacaattaac gctggaaaag ggttggaatt tgaaacagac 
33241 tcctcaaacg gaaatcccat aaaaacaaaa attggatcag gcatacaata taataccaat 
33301 ggagctatgg ttgcaaaact tggaacaggc ctcagttttg acagctccgg agccataaca 
33361 atgggcagca taaacaatga cagacttact ctttggacaa caccagaccc atccccaaat 
33421 tgcagaattg cttcagataa agactgcaag ctaactctgg cgctaacaaa atgtggcagt 
33481 caaattttgg gcactgtttc agctttggca gtatcaggta atatggcctc catcaatgga 
33541 actctaagca gtgtaaactt ggttcttaga tttgatgaca acggagtgct tatgtcaaat 
33601 tcatcactgg acaaacagta ttggaacttt agaaacgggg actccactaa cggtcaacca 
33661 tacacttatg ctgttgggtt tatgccaaac ctaaaagctt acccaaaaac tcaaagtaaa 
33721 actgcaaaaa gtaatattgt tagccaggtg tatcttaatg gtgacaagtc taaaccattg 
33781 cattttacta ttacgctaaa tggaacagat gaaaccaacc aagtaagcaa atactcaata 
33841 tcattcagtt ggtcctggaa cagtggacaa tacactaatg acaaatttgc caccaattcc 
33901 tataccttct cctacattgc ccaggaataa agaatcgtga acctgttgca tgttatgttt 
33961 caacgtgttt atttttcaat tgcagaaaat ttcaagtcat ttttcattca gtagtatagc
34021 cccaccacca catagcttat actaatcacc gtaccttaat caaactcaca gaaccctagt 
34081 attcaacctg ccacctccct cccaacacac agagtacaca gtcctttctc cccggctggc 
34141 cttaaacagc atcatatcat gggtaacaga catattctta ggtgttatat tccacacggt 
34201 ctcctgtcga gccaaacgct catcagtgat gttaataaac tccccgggca gctcgcttaa 
34261 gttcatgtcg ctgtccagct gctgagccac aggctgctgt ccaacttgcg gttgctcaac 
34321 gggcggcgaa ggagaagtcc acgcctacat gggggtagag tcataatcgt gcatcaggat 
34381 agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc cgccgccgct ccgtcctgca 
34441 ggaatacaac atggcagtgg tctcctcagc gatgattcgc accgcccgca gcataaggcg 
34501 ccttgtcctc cgggcacagc agcgcaccct gatctcactt aagtcagcac agtaactgca 
34561 gcacagtacc acaatattgt ttaaaatccc acagtgcaag gcgctgtatc caaagctcat 
34621 ggcggggacc acagaaccca cgtggccatc ataccacaag cgcaggtaga ttaagtggcg 
34681 acccctcata aacacgctgg acataaacat tacctctttt ggcatgttgt aattcaccac 
34741 ctcccggtac catataaacc tctgattaaa catggcgcca tccaccacca tcctaaacca 
34801 gctggccaaa acctgcccgc cggctatgca ctgcagggaa ccgggactgg aacaatgaca 
34861 gtggagagcc caggactcgt aaccatggat catcatgctc gtcatgatat caatgttggc 
34921 acaacacagg cacacgtgca tacacttcct caggattaca agctcctccc gcgtcagaac 
34981 catatcccag ggaacaaccc attcctgaat cagcgtaaat cccacactgc agggaagacc 
35041 tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat tcgggcagca gcggatgatc 
35101 ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt agacgatccc tactgtacgg 
35161 agtgcgccga gacaaccgag atcgtgttgg tcgtagtgtc atgccaaatg gaacgccgga 
35221 cgtagtcata tttcctgaag caaaaccagg tgcgggcgtg acaaacagat ctgcgtctcc 
35281 ggtctcgccg cttagatcgc tctgtgtagt agttgtagta tatccactct ctcaaagcat 
35341 ccaggcgccc cctggcttcg ggttctatgt aaactccttc atgcgccgct gccctgataa 
35401 catccaccac cgcagaataa gccacaccca gccaacctac acattcgttc tgcgagtcac 
35461 acacgggagg agcgggaaga gctggaagaa ccatgttttt ttttttattc caaaagatta 
35521 tccaaaacct caaaatgaag atctattaag tgaacgcgct cccctccggt ggcgtggtca 
35581 aactctacag ccaaagaaca gataatggca tttgtaagat gttgcacaat ggcttccaaa 
35641 aggcaaacgg ccctcacgtc caagtggacg taaaggctaa acccttcagg gtgaatctcc 
35701 tctataaaca ttccagcacc ttcaaccatg cccaaataat tctcatctcg ccaccttctc 
35761 aatatatctc taagcaaatc ccgaatatta agtccggcca ttgtaaaaat ctgctccaga 
35821 gcgccctcca ccttcagcct caagcagcga atcatgattg caaaaattca ggttcctcac 
35881 agacctgtat aagattcaaa agcggaacat taacaaaaat accgcgatcc cgtaggtccc 
35941 ttcgcagggc cagctgaaca taatcgtgca ggtctgcacg gaccagcgcg gccacttccc 
36001 cgccaggaac catgacaaaa gaacccacac tgattatgac acgcatactc ggagctatgc 
36061 taaccagcgt agccccgatg taagcttgtt gcatgggcgg cgatataaaa tgcaaggtgc 
36121 tgctcaaaaa atcaggcaaa gcctcgcgca aaaaagaaag cacatcgtag tcatgctcat 
36181 gcagataaag gcaggtaagc tccggaacca ccacagaaaa agacaccatt tttctctcaa 
36241 acatgtctgc gggtttctgc ataaacacaa aataaaataa caaaaaaaca tttaaacatt 
36301 agaagcctgt cttacaacag gaaaaacaac ccttataagc ataagacgga ctacggccat 
36361 gccggcgtga ccgtaaaaaa actggtcacc gtgattaaaa agcaccaccg acagctcctc
36421 ggtcatgtcc ggagtcataa tgtaagactc ggtaaacaca tcaggttgat tcacatcggt 
36481 cagtgctaaa aagcgaccga aatagcccgg gggaatacat acccgcaggc gtagagacaa 
36541 cattacagcc cccataggag gtataacaaa attaatagga gagaaaaaca cataaacacc 
36601 tgaaaaaccc tcctgcctag gcaaaatagc accctcccgc tccagaacaa catacagcgc 
36661 ttccacagcg gcagccataa cagtcagcct taccagtaaa aaagaaaacc tattaaaaaa 
36721 acaccactcg acacggcacc agctcaatca gtcacagtgt aaaaaagggc caagtgcaga 
36781 gcgagtatat ataggactaa aaaatgacgt aacggttaaa gtccacaaaa aacacccaga
36841 aaaccgcacg cgaacctacg cccagaaacg aaagccaaaa aacccacaac ttcctcaaat 
36901 cgtcacttcc gttttcccac gttacgtcac ttcccatttt aagaaaacta caattcccaa 
36961 cacatacaag ttactccgcc ctaaaaccta cgtcacccgc cccgttccca cgccccgcgc 
37021 cacgtcacaa actccacccc ctcattatca tattggcttc aatccaaaat aaggtatatt 
37081 attgatgatg 
10 30 50

ATGGCGCCCATCACGGCCTACTCCCAACAGACGCGGGGCCTACTTGGTTGCATCATCACT 

+ + + + + + 

MetAlaProIleThrAlaTyrSerGlnGlnThrArgGlyLeuLeuGlyCysIleIleThr

10 20 



70 90 110 

AGCCTTACAGGCCGGGACAAGAACCAGGTCGAGGGAGAGGTTCAGGTGGTTTCCACCGCA 



SerLeuThrGlyArgAspLysAsnGlnValGluGlyGluValGlnValValSerThrAla 

30 40 



130 150 170 

ACACAATCCTTCCTGGCGACCTGCGTCAACGGCGTGTGTTGGACCGTTTACCATGGTGCT 



ThrGlnSerPheLeuAlaThrCysValAsnGlyValCysTrpThrValTyrHisGlyAla 

50 60 



190 210 230 

GGCTCAAAGACCTTAGCCGGCCCAAAGGGGCCAATCACCCAGATGTACACTAATGTGGAC 



GlySerLysThrLeuAlaGlyProLysGlyProIleThrGlnMetTyrThrAsnValAsp 

70 80 

250 270 290 

CAGGACCTCGTCGGCTGGCAGGCGCCCCCCGGGGCGCGTTCCTTGACACCATGCACCTGT 



GlnAspLeuValGlyTrpGlnAlaProProGlyAlaArgSerLeuThrProCysThrCys 

90 100 



310 330 350 

GGCAGCTCAGACCTTTACTTGGTCACGAGACATGCTGACGTCATTCCGGTGCGCCGGCGG 



GlySerSerAspLeuTyrLeuValThrArgHisAlaAspValIleProValArgArgArg

110 120 



370 390 410 

GGCGACAGTAGGGGGAGCCTGCTCTCCCCCAGGCCTGTCTCCTACTTGAAGGGCTCTTCG 



GlyAspSerArgGlySerLeuLeuSerProArgProValSerTyrLeuLysGlySerSer 

130 140 
430 450 470

GGTGGTCCACTGCTCTGCCCTTCGGGGCACGCTGTGGGCATCTTCCGGGCTGCCGTATGC 

+ + + + + + 

GlyGlyProLeuLeuCysProSerGlyHisAlaValGlyIlePheArgAlaAlaValCys

150 160 



490 510 530 

ACCCGGGGGGTTGCGAAGGCGGTGGACTTTGTGCCCGTAGAGTCCATGGAAACTACTATG



ThrArgGlyValAlaLysAlaValAspPheValProValGluSerMetGluThrThrMet 

170 180 

550 570 590 

CGGTCTCCGGTCTTCACGGACAACTCATCCCCCCCGGCCGTACCGCAGTCATTTCAAGTG 



ArgSerProValPheThrAspAsnSerSerProProAlaValProGlnSerPheGlnVal 

190 200 



610 630 650 

GCCCACCTACACGCTCCCACTGGCAGCGGCAAGAGTACTAAAGTGCCGGCTGCATATGCA 



AlaHisLeuHisAlaProThrGlySerGlyLysSerThrLysValProAlaAlaTyrAla 

210 220 

670 690 710 

GCCCAAGGGTACAAGGTGCTCGTCCTCAATCCGTCCGTTGCCGCTACCTTAGGGTTTGGG 



AlaGlnGlyTyrLysValLeuValLeuAsnProSerValAlaAlaThrLeuGlyPheGly 

230 240 



730 750 770 

GCGTATATGTCTAAGGCACACGGTATTGACCCCAACATCAGAACTGGGGTAAGGACCATT 



AlaTyrMetSerLysAlaHisGlyIleAspProAsnIleArgThrGlyValArgThrIle

250 260 



790 810 830 

ACCACAGGCGCCCCCGTCACATACTCTACCTATGGCAAGTTTCTTGCCGATGGTGGTTGC 



ThrThrGlyAlaProValThrTyrSerThrTyrGlyLysPheLeuAlaAspGlyGlyCys 

270 280 
850 870 890

TCTGGGGGCGCTTATGACATCATAATATGTGATGAGTGCCATTCAACTGACTCGACTACA 

+ + + + + +

SerGlyGlyAlaTyrAspIleIleIleCysAspGluCysHisSerThrAspSerThrThr

290 300 

910 930 950 

ATCTTGGGCATCGGCACAGTCCTGGACCAAGCGGAGACGGCTGGAGCGCGGCTTGTCGTG 

+ + + + + + 

IleLeuGlyIleGlyThrValLeuAspGlnAlaGluThrAlaGlyAlaArgLeuValVal

310 320 

970 990 1010 

CTCGCCACCGCTACGCCTCCGGGATCGGTCACCGTGCCACACCCAAACATCGAGGAGGTG 

+ + + + + + 

LeuAlaThrAlaThrProProGlySerValThrValProHisProAsnIleGluGluVal

330 340 



1030 1050 1070 

GCCCTGTCTAATACTGGAGAGATCCCCTTCTATGGCAAAGCCATCCCCATTGAAGCCATC 



AlaLeuSerAsnThrGlyGluIleProPheTyrGlyLysAlaIleProIleGluAlaIle

350 360 

1090 1110 1130

AGGGGGGGAAGGCATCTCATTTTCTGTCATTCCAAGAAGAAGTGCGACGAGCTCGCCGCA 



ArgGlyGlyArgHisLeuIlePheCysHisSerLysLysLysCysAspGluLeuAlaAla 

370 380 



1150 1170 1190

AAGCTGTCAGGCCTCGGAATCAACGCTGTGGCGTATTACCGGGGGCTCGATGTGTCCGTC 



LysLeuSerGlyLeuGlyIleAsnAlaValAlaTyrTyrArgGlyLeuAspValSerVal

390 400 



1210 1230 1250 

ATACCAACTATCGGAGACGTCGTTGTCGTGGCAACAGACGCTCTGATGACGGGCTATACG 



IleProThrIleGlyAspValValValValAlaThrAspAlaLeuMetThrGlyTyrThr

410 420 
1270 1290 1310

GGCGACTTTGACTCAGTGATCGACTGTAACACATGTGTCACCCAGACAGTCGACTTCAGC 

+ + + + + +

GlyAspPheAspSerValIleAspCysAsnThrCysValThrGlnThrValAspPheSer

430 440 

1330 1350 1370 

TTGGATCCCACCTTCACCATTGAGACGACGACCGTGCCTCAAGACGCAGTGTCGCGCTCG 

+ + + + + + 

LeuAspProThrPheThrIleGluThrThrThrValProGlnAspAlaValSerArgSer

450 460 

1390 1410 1430 

CAGCGGCGGGGTAGGACTGGCAGGGGTAGGAGAGGCATCTACAGGTTTGTGACTCCGGGA 

+ + + + + + 

GlnArgArgGlyArgThrGlyArgGlyArgArgGlyIleTyrArgPheValThrProGly

470 480 

1450 1470 1490 

GAACGGCCCTCGGGCATGTTCGATTCCTCGGTCCTGTGTGAGTGCTATGACGCGGGCTGT 

+ + + + + + 

GluArgProSerGlyMetPheAspSerSerValLeuCysGluCysTyrAspAlaGlyCys 

490 500 

1510 1530 1550 

GCTTGGTACGAGCTCACCCCCGCCGAGACCTCGGTTAGGTTGCGGGCCTACCTGAACACA 

+ + + + + + 

AlaTrpTyrGluLeuThrProAlaGluThrSerValArgLeuArgAlaTyrLeuAsnThr 

510 520 

1570 1590 1610 

CCAGGGTTGCCCGTTTGCCAGGACCACCTGGAGTTCTGGGAGAGTGTCTTCACAGGCCTC 

+ + + + + + 

ProGlyLeuProValCysGlnAspHisLeuGluPheTrpGluSerValPheThrGlyLeu 

530 540 

1630 1650 1670 

ACCCACATAGATGCACACTTCTTGTCCCAGACCAAGCAGGCAGGAGACAACTTCCCCTAC 

+ + + + + + 

ThrHisIleAspAlaHisPheLeuSerGlnThrLysGlnAlaGlyAspAsnPheProTyr 

550 560 
1690 1710 1730

CTGGTAGCATACCAAGCCACGGTGTGCGCCAGGGCTCAGGCCCCACCTCCATCATGGGAT 



+ + + + + +



LeuValAlaTyrGlnAlaThrValCysAlaArgAlaGlnAlaProProProSerTrpAsp

570 580 

1750 1770 1790 

CAAATGTGGAAGTGTCTCATACGGCTGAAACCTACGCTGCACGGGCCAACACCCTTGCTG 

+ + + + + + 

GlnMetTrpLysCysLeuIleArgLeuLysProThrLeuHisGlyProThrProLeuLeu 

590 600 

1810 1830 1850 

TACAGGCTGGGAGCCGTCCAAAATGAGGTCACCCTCACCCACCCCATAACCAAATACATC 

+ + + + + + 

TyrArgLeuGlyAlaValGlnAsnGluValThrLeuThrHisProIleThrLysTyrIle

610 620 

1870 1890 1910 

ATGGCATGCATGTCGGCTGACCTGGAGGTCGTCACTAGCACCTGGGTGCTGGTGGGCGGA 

+ + + + + +

MetAlaCysMetSerAlaAspLeuGluValValThrSerThrTrpValLeuValGlyGly 

630 640 

1930 1950 1970 

GTCCTTGCAGCTCTGGCCGCGTATTGCCTGACAACAGGCAGTGTGGTCATTGTGGGTAGG 

+ + + + + + 

ValLeuAlaAlaLeuAlaAlaTyrCysLeuThrThrGlySerValValIleValGlyArg

650 660 

1990 2010 2030 

ATTATCTTGTCCGGGAGGCCGGCTATTGTTCCCGACAGGGAGTTTCTCTACCAGGAGTTC 

+ + + + + + 

IleIleLeuSerGlyArgProAlaIleValProAspArgGluPheLeuTyrGlnGluPhe

670 680 

2050 2070 2090 

GATGAAATGGAAGAGTGCGCCTCGCACCTCCCTTACATCGAGCAGGGAATGCAGCTCGCC 

+ + + + + +

AspGluMetGluGluCysAlaSerHisLeuProTyrIleGluGlnGlyMetGlnLeuAla

690 700 
2110 2130 2150

GAGCAATTCAAGCAGAAAGCGCTCGGGTTACTGCAAACAGCCACCAAACAAGCGGAGGCT 

+ + + + + + 

GluGlnPheLysGlnLysAlaLeuGlyLeuLeuGlnThrAlaThrLysGlnAlaGluAla 

710 720 



2170 2190 2210 

GCTGCTCCCGTGGTGGAGTCCAAGTGGCGAGCCCTTGAGACATTCTGGGCGAAGCACATG 



AlaAlaProValValGluSerLysTrpArgAlaLeuGluThrPheTrpAlaLysHisMet 

730 740 



2230 2250 2270 

TGGAATTTCATCAGCGGGATACAGTACTTAGCAGGCTTATCCACTCTGCCTGGGAACCCC



TrpAsnPheIleSerGlyIleGlnTyrLeuAlaGlyLeuSerThrLeuProGlyAsnPro

750 760 



2290 2310 2330 

GCAATAGCATCATTGATGGCATTCACAGCCTCTATCACCAGCCCGCTCACCACCCAAAGT 



AlaIleAlaSerLeuMetAlaPheThrAlaSerIleThrSerProLeuThrThrGlnSer

770 780 



2350 2370 2390 

ACCCTCCTGTTTAACATCTTGGGGGGGTGGGTGGCTGCCCAACTCGCCCCCCCCAGCGCC 



ThrLeuLeuPheAsnIleLeuGlyGlyTrpValAlaAlaGlnLeuAlaProProSerAla

790 800 

2410 2430 2450 

GCTTCGGCTTTCGTGGGCGCCGGCATCGCCGGTGCGGCTGTTGGCAGCATAGGCCTTGGG 



AlaSerAlaPheValGlyAlaGlyIleAlaGlyAlaAlaValGlySerIleGlyLeuGly

810 820 



2470 2490 2510 

AAGGTGCTTGTGGACATTCTGGCGGGTTATGGAGCAGGAGTGGCCGGCGCGCTCGTGGCC 



LysValLeuValAspIleLeuAlaGlyTyrGlyAlaGlyValAlaGlyAlaLeuValAla 

830 840 
2530 2550 2570

TTCAAGGTCATGAGCGGCGAGATGCCCTCCACCGAGGACCTGGTCAATCTACTTCCTGCC 



PheLysValMetSerGlyGluMetProSerThrGluAspLeuValAsnLeuLeuProAla 

850 860 



2590 2610 2630 

ATCCTCTCTCCTGGCGCCCTGGTCGTCGGGGTCGTGTGTGCAGCAATACTGCGTCGACAC 



IleLeuSerProGlyAlaLeuValValGlyValValCysAlaAlaIleLeuArgArgHis

870 880 



2650 2670 2690 

GTGGGTCCGGGAGAGGGGGCTGTGCAGTGGATGAACCGGCTGATAGCGTTCGCCTCGCGG 



ValGlyProGlyGluGlyAlaValGlnTrpMetAsnArgLeuIleAlaPheAlaSerArg 

890 900 



2710 2730 2750 

GGTAATCATGTTTCCCCCACGCACTATGTGCCTGAGAGCGACGCCGCAGCGCGTGTTACT 



GlyAsnHisValSerProThrHisTyrValProGluSerAspAlaAlaAlaArgValThr 

910 920 



2770 2790 2810 

CAGATCCTCTCCAGCCTTACCATCACTCAGCTGCTGAAAAGGCTCCACCAGTGGATTAAT 



GlnIleLeuSerSerLeuThrIleThrGlnLeuLeuLysArgLeuHisGlnTrpIleAsn

930 940 



2830 2850 2870 

GAAGACTGCTCCACACCGTGTTCCGGCTCGTGGCTAAGGGATGTTTGGGACTGGATATGC 



GluAspCysSerThrProCysSerGlySerTrpLeuArgAspValTrpAspTrpIleCys 

950 960 



2890 2910 2930 

ACGGTGTTGACTGACTTCAAGACCTGGCTCCAGTCCAAGCTCCTGCCGCAGCTACCGGGA 



ThrValLeuThrAspPheLysThrTrpLeuGlnSerLysLeuLeuProGlnLeuProGly 

970 980 
2950 2970 2990

GTCCCTTTTTTCTCGTGCCAACGCGGGTACAAGGGAGTCTGGCGGGGAGACGGCATCATG 



ValProPhePheSerCysGlnArgGlyTyrLysGlyValTrpArgGlyAspGlyIleMet

990 1000 

3010 3030 3050 

CAAACCACCTGCCCATGTGGAGCACAGATCACCGGACATGTCAAAAACGGTTCCATGAGG 



GlnThrThrCysProCysGlyAlaGlnIleThrGlyHisValLysAsnGlySerMetArg

1010 1020 



3070 3090 3110 

ATCGTCGGGCCTAAGACCTGCAGCAACACGTGGCATGGAACATTCCCCATCAACGCATAC 

+ + + + + + 

IleValGlyProLysThrCysSerAsnThrTrpHisGlyThrPheProIleAsnAlaTyr 

1030 1040 



3130 3150 3170 

ACCACGGGCCCCTGCACACCCTCTCCAGCGCCAAACTATTCTAGGGCGCTGTGGCGGGTG 

+ + + + + + 

ThrThrGlyProCysThrProSerProAlaProAsnTyrSerArgAlaLeuTrpArgVal 

1050 1060 

3190 3210 3230 

GCCGCTGAGGAGTACGTGGAGGTCACGCGGGTGGGGGATTTCCACTACGTGACGGGCATG 



AlaAlaGluGluTyrValGluValThrArgValGlyAspPheHisTyrValThrGlyMet 

1070 1080 

3250 3270 3290 

ACCACTGACAACGTAAAGTGCCCATGCCAGGTTCCGGCTCCTGAATTCTTCACGGAGGTG

+ + + + + + 

ThrThrAspAsnValLysCysProCysGlnValProAlaProGluPhePheThrGluVal 

1090 1100 



3310 3330 3350 

GACGGAGTGCGGTTGCACAGGTACGCTCCGGCGTGCAGGCCTCTCCTACGGGAGGAGGTT 

+ + + + + + 

AspGlyValArgLeuHisArgTyrAlaProAlaCysArgProLeuLeuArgGluGluVal 

1110 1120 
3370 3390 3410

ACATTCCAGGTCGGGCTCAACCAATACCTGGTTGGGTCACAGCTACCATGCGAGCCCGAA 



ThrPheGlnValGlyLeuAsnGlnTyrLeuValGlySerGlnLeuProCysGluProGlu 

1130 1140 



3430 3450 3470 

CCGGATGTAGCAGTGCTCACTTCCATGCTCACCGACCCCTCCCACATCACAGCAGAAACG 



ProAspValAlaValLeuThrSerMetLeuThrAspProSerHisIleThrAlaGluThr 

1150 1160

3490 3510 3530 

GCTAAGCGTAGGTTGGCCAGGGGGTCTCCCCCCTCCTTGGCCAGCTCTTCAGCTAGCCAG 



AlaLysArgArgLeuAlaArgGlySerProProSerLeuAlaSerSerSerAlaSerGln 

1170 1180



3550 3570 3590 

TTGTCTGCGCCTTCCTTGAAGGCGACATGCACTACCCACCATGTCTCTCCGGACGCTGAC 



LeuSerAlaProSerLeuLysAlaThrCysThrThrHisHisValSerProAspAlaAsp 

1190 1200 



3610 3630 3650 

CTCATCGAGGCCAACCTCCTGTGGCGGCAGGAGATGGGCGGGAACATCACCCGCGTGGAG 



LeuIleGluAlaAsnLeuLeuTrpArgGlnGluMetGlyGlyAsnIleThrArgValGlu

1210 1220 



3670 3690 3710 

TCGGAGAACAAGGTGGTAGTCCTGGACTCTTTCGACCCGCTTCGAGCGGAGGAGGATGAG 



SerGluAsnLysValValValLeuAspSerPheAspProLeuArgAlaGluGluAspGlu 

1230 1240 



3730 3750 3770 

AGGGAAGTATCCGTTCCGGCGGAGATCCTGCGGAAATCCAAGAAGTTCCCCGCAGCGATG 



ArgGluValSerValProAlaGluIleLeuArgLysSerLysLysPheProAlaAlaMet 

1250 1260 
3790 3810 3830

CCCATCTGGGCGCGCCCGGATTACAACCCTCCACTGTTAGAGTCCTGGAAGGACCCGGAC 

+ + + + + +

ProIleTrpAlaArgProAspTyrAsnProProLeuLeuGluSerTrpLysAspProAsp 

1270 1280 



3850 3870 3890 

TACGTCCCTCCGGTGGTGCACGGGTGCCCGTTGCCACCTATCAAGGCCCCTCCAATACCA 



TyrValProProValValHisGlyCysProLeuProProIleLysAlaProProIlePro 

1290 1300 



3910 3930 3950 

CCTCCACGGAGAAAGAGGACGGTTGTCCTAACAGAGTCCTCCGTGTCTTCTGCCTTAGCG 

+ + + + + + 

ProProArgArgLysArgThrValValLeuThrGluSerSerValSerSerAlaLeuAla 

1310 1320 



3970 3990 4010 

GAGCTCGCTACTAAGACCTTCGGCAGCTCCGAATCATCGGCCGTCGACAGCGGCACGGCG 



GluLeuAlaThrLysThrPheGlySerSerGluSerSerAlaValAspSerGlyThrAla 

1330 1340 



4030 4050 4070 

ACCGCCCTTCCTGACCAGGCCTCCGACGACGGTGACAAAGGATCCGACGTTGAGTCGTAC 



ThrAlaLeuProAspGlnAlaSerAspAspGlyAspLysGlySerAspValGluSerTyr 

1350 1360 



4090 4110 4130 

TCCTCCATGCCCCCCCTTGAGGGGGAACCGGGGGACCCCGATCTCAGTGACGGGTCTTGG 



SerSerMetProProLeuGluGlyGluProGlyAspProAspLeuSerAspGlySerTrp 

1370 1380 



4150 4170 4190 

TCTACCGTGAGCGAGGAAGCTAGTGAGGATGTCGTCTGCTGCTCAATGTCCTACACATGG 



SerThrValSerGluGluAlaSerGluAspValValCysCysSerMetSerTyrThrTrp 

1390 1400 
4210 4230 4250

ACAGGCGCCTTGATCACGCCATGCGCTGCGGAGGAAAGCAAGCTGCCCATCAACGCGTTG 

+ + + + + + 

ThrGlyAlaLeuIleThrProCysAlaAlaGluGluSerLysLeuProIleAsnAlaLeu 

1410 1420 

4270 4290 4310 

AGCAACTCTTTGCTGCGCCACCATAACATGGTTTATGCCACAACATCTCGCAGCGCAGGC 

+ + + + + +

SerAsnSerLeuLeuArgHisHisAsnMetValTyrAlaThrThrSerArgSerAlaGly 

1430 1440 

4330 4350 4370 

CTGCGGCAGAAGAAGGTCACCTTTGACAGACTGCAAGTCCTGGACGACCACTACCGGGAC 

+ + + + + + 

LeuArgGlnLysLysValThrPheAspArgLeuGlnValLeuAspAspHisTyrArgAsp 

1450 1460 

4390 4410 4430 

GTGCTCAAGGAGATGAAGGCGAAGGCGTCCACAGTTAAGGCTAAACTCCTATCCGTAGAG

+ + + + + + 

ValLeuLysGluMetLysAlaLysAlaSerThrValLysAlaLysLeuLeuSerValGlu 

1470 1480 

4450 4470 4490 

GAAGCCTGCAAGCTGACGCCCCCACATTCGGCCAAATCCAAGTTTGGCTATGGGGCAAAG 

+ + + + + + 

GluAlaCysLysLeuThrProProHisSerAlaLysSerLysPheGlyTyrGlyAlaLys 

1490 1500 

4510 4530 4550 

GACGTCCGGAACCTATCCAGCAAGGCCGTTAACCACATCCACTCCGTGTGGAAGGACTTG

+ + + + + 

AspValArgAsnLeuSerSerLysAlaValAsnHisIleHisSerValTrpLysAspLeu 

1510 1520 

4570 4590 4610 

CTGGAAGACACTGTGACACCAATTGACACCACCATCATGGCAAAAAATGAGGTTTTCTGT 

+ + + + + + 

LeuGluAspThrValThrProIleAspThrThrIleMetAlaLysAsnGluValPheCys

1530 1540 
4630 4650 4670

GTCCAACCAGAGAAAGGAGGCCGTAAGCCAGCCCGCCTTATCGTATTCCCAGATCTGGGA 

+ + + + + + 

ValGlnProGluLysGlyGlyArgLysProAlaArgLeuIleValPheProAspLeuGly

1550 1560 

4690 4710 4730 

GTCCGTGTATGCGAGAAGATGGCCCTCTATGATGTGGTCTCCACCCTTCCTCAGGTCGTG 

+ + + + + + 

ValArgValCysGluLysMetAlaLeuTyrAspValValSerThrLeuProGlnValVal 

1570 1580 

4750 4770 4790 

ATGGGCTCCTCATACGGATTCCAGTACTCTCCTGGGCAGCGAGTCGAGTTCCTGGTGAAT 

+ + + + + +

MetGlySerSerTyrGlyPheGlnTyrSerProGlyGlnArgValGluPheLeuValAsn 

1590 1600 

4810 4830 4850 

ACCTGGAAATCAAAGAAAAACCCCATGGGCTTTTCATATGACACTCGCTGTTTCGACTCA 

+ + + + + + 

ThrTrpLysSerLysLysAsnProMetGlyPheSerTyrAspThrArgCysPheAspSer 

1610 1620

4870 4890 4910 

ACGGTCACCGAGAACGACATCCGTGTTGAGGAGTCAATTTACCAATGTTGTGACTTGGCC 

+ + + + + + 

ThrValThrGluAsnAspIleArgValGluGluSerIleTyrGlnCysCysAspLeuAla

1630 1640 

4930 4950 4970 

CCCGAAGCCAGACAGGCCATAAAATCGCTCACAGAGCGGCTTTATATCGGGGGTCCTCTG 

+ + + + + + 

ProGluAlaArgGlnAlaIleLysSerLeuThrGluArgLeuTyrIleGlyGlyProLeu

1650 1660 

4990 5010 5030 

ACTAATTCAAAAGGGCAGAACTGCGGTTATCGCCGGTGCCGCGCGAGCGGCGTGCTGACG 

+ + + + + + 

ThrAsnSerLysGlyGlnAsnCysGlyTyrArgArgCysArgAlaSerGlyValLeuThr 

1670 1680 
5050 5070 5090

ACTAGCTGCGGTAACACCCTCACATGTTACTTGAAGGCCTCTGCAGCCTGTCGAGCTGCG 



ThrSerCysGlyAsnThrLeuThrCysTyrLeuLysAlaSerAlaAlaCysArgAlaAla 

1690 1700 



5110 5130 5150 

AAGCTCCAGGACTGCACGATGCTCGTGAACGGAGACGACCTTGTCGTTATCTGTGAAAGC 



LysLeuGlnAspCysThrMetLeuValAsnGlyAspAspLeuValValIleCysGluSer

1710 1720 



5170 5190 5210 

GCGGGAACCCAAGAGGACGCGGCGAGCCTACGAGTCTTCACGGAGGCTATGACTAGGTAC 



AlaGlyThrGlnGluAspAlaAlaSerLeuArgValPheThrGluAlaMetThrArgTyr 

1730 1740 



5230 5250 5270 

TCTGCCCCCCCCGGGGACCCGCCCCAACCAGAATACGACTTGGAGCTGATAACATCATGT 



SerAlaProProGlyAspProProGlnProGluTyrAspLeuGluLeuIleThrSerCys 

1750 1760 



5290 5310 5330 

TCCTCCAATGTGTCGGTCGCCCACGATGCATCAGGCAAAAGGGTGTACTACCTCACCCGT



SerSerAsnValSerValAlaHisAspAlaSerGlyLysArgValTyrTyrLeuThrArg 

1770 1780 



5350 5370 5390 

GATCCCACCACCCCCCTCGCACGGGCTGCGTGGGAAACAGCTAGACACACTCCAGTTAAC 



AspProThrThrProLeuAlaArgAlaAlaTrpGluThrAlaArgHisThrProValAsn 

1790 1800 



5410 5430 5450 

TCCTGGCTAGGCAACATTATCATGTATGCGCCCACTTTGTGGGCAAGGATGATTCTGATG 



SerTrpLeuGlyAsnIleIleMetTyrAlaProThrLeuTrpAlaArgMetIleLeuMet

1810 1820 
5470 5490 5510

ACTCACTTCTTCTCCATCCTTCTAGCACAGGAGCAACTTGAAAAAGCCCTGGACTGCCAG 

+ + + + + + 

ThrHisPhePheSerIleLeuLeuAlaGlnGluGlnLeuGluLysAlaLeuAspCysGln

1830 1840 



5530 5550 5570 

ATCTACGGGGCCTGTTACTCCATTGAGCCACTTGACCTACCTCAGATCATTGAACGACTC



IleTyrGlyAlaCysTyrSerIleGluProLeuAspLeuProGlnIleIleGluArgLeu

1850 1860 

5590 5610 5630 

CATGGCCTTAGCGCATTTTCACTCCATAGTTACTCTCCAGGTGAGATCAATAGGGTGGCT 



HisGlyLeuSerAlaPheSerLeuHisSerTyrSerProGlyGluIleAsnArgValAla 

1870 1880 



5650 5670 5690 

TCATGCCTCAGGAAACTTGGGGTACCACCCTTGCGAGTCTGGAGACATCGGGCCAGGAGC

+ + + + + +

SerCysLeuArgLysLeuGlyValProProLeuArgValTrpArgHisArgAlaArgSer 

1890 1900 

5710 5730 5750 

GTCCGCGCTAGGCTACTGTCCCAGGGGGGGAGGGCCGCCACTTGTGGCAAGTACCTCTTC 

+ + + + + + 

ValArgAlaArgLeuLeuSerGlnGlyGlyArgAlaAlaThrCysGlyLysTyrLeuPhe 

1910 1920 

5770 5790 5810 

AACTGGGCAGTGAAGACCAAACTCAAACTCACTCCAATCCCGGCTGCGTCCCAGCTGGAC 

+ + + + + + 

AsnTrpAlaValLysThrLysLeuLysLeuThrProIleProAlaAlaSerGlnLeuAsp 

1930 1940 



5830 5850 5870 

TTGTCCGGCTGGTTCGTTGCTGGTTACAGCGGGGGAGACATATATCACAGCCTGTCTCGT 



LeuSerGlyTrpPheValAlaGlyTyrSerGlyGlyAspIleTyrHisSerLeuSerArg 

1950 1960
5890 5910 5930

GCCCGACCCCGCTGGTTCATGCTGTGCCTACTCCTACTTTCTGTAGGGGTAGGCATCTAC 



AlaArgProArgTrpPheMetLeuCysLeuLeuLeuLeuSerValGlyValGlyIleTyr

1970 1980 

5950 5955 
CTGCTCCCCAACCGA (SEQ. ID. NO. 5) 



LeuLeuProAsnArg (SEQ. ID. NO. 6)
1985 
1 TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG

51 GAGACGGTCA CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG

101 TCAGGGCGCG TCAGCGGGTG TTGGCGGGTG TCGGGGCTGG CTTAACTATG

151 CGGCATCAGA GCAGATTGTA CTGAGAGTGC ACCATATGCG GTGTGAAATA

201 CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG CTATTGGCCA

251 TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG

301 TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT

351 AATCAATTAC GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT

401 ACATAACTTA CGGTAAATGG CCCGCCTGGC TGACCGCCCA ACGACCCCCG

451 CCCATTGACG TCAATAATGA CGTATGTTCC CATAGTAACG CCAATAGGGA

501 CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC TGCCCACTTG

551 GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA

601 TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG

651 ACTTTCCTAC TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG

701 GTGATGCGGT TTTGGCAGTA CATCAATGGG CGTGGATAGC GGTTTGACTC

751 ACGGGGATTT CCAAGTCTCC ACCCCATTGA CGTCAATGGG AGTTTGTTTT

801 GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA CTCCGCCCCA

851 TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG

901 AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT

951 TTGACCTCCA TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA

1001 CGGTGCATTG GAACGCGGAT TCCCCGTGCC AAGAGTGACG TAAGTACCGC

1051 CTATAGACTC TATAGGCACA CCCCTTTGGC TCTTATGCAT GCTATACTGT

1101 TTTTGGCTTG GGGCCTATAC ACCCCCGCTT CCTTATGCTA TAGGTGATGG

1151 TATAGCTTAG CCTATAGGTG TGGGTTATTG ACCATTATTG ACCACTCCCC

1201 TATTGGTGAC GATACTTTCC ATTACTAATC CATAACATGG CTCTTTGCCA

1251 CAACTATCTC TATTGGCTAT ATGCCAATAC TCTGTCCTTC AGAGACTGAC

1301 ACGGACTCTG TATTTTTACA GGATGGGGTC CCATTTATTA TTTACAAATT

1351 CACATATACA ACAACGCCGT CCCCCGTGCC CGCAGTTTTT ATTAAACATA

1401 GCGTGGGATC TCCACGCGAA TCTCGGGTAC GTGTTCCGGA CATGGGCTCT

1451 TCTCCGGTAG CGGCGGAGCT TCCACATCCG AGCCCTGGTC CCATGCCTCC

1501 AGCGGCTCAT GGTCGCTCGG CAGCTCCTTG CTCCTAACAG TGGAGGCCAG

1551 ACTTAGGCAC AGCACAATGC CCACCACCAC CAGTGTGCCG CACAAGGCCG

1601 TGGCGGTAGG GTATGTGTCT GAAAATGAGC GTGGAGATTG GGCTCGCACG

1651 GCTGACGCAG ATGGAAGACT TAAGGCAGCG GCAGAAGAAG ATGCAGGCAG

1701 CTGAGTTGTT GTATTCTGAT AAGAGTCAGA GGTAACTCCC GTTGCGGTGC

1751 TGTTAACGGT GGAGGGCAGT GTAGTCTGAG CAGTACTCGT TGCTGCCGCG

1801 CGCGCCACCA GACATAATAG CTGACAGACT AACAGACTGT TCCTTTCCAT

1851 GGGTCTTTTC TGCAGTCACC GTCCTTAGAT CTAGGTACCA GATATCAGAA

1901 TTCAGTCGAC AGCGGCCGCG ATCTGCTGTG CCTTCTAGTT GCCAGCCATC

1951 TGTTGTTTGC CCCTCCCCCG TGCCTTCCTT GACCCTGGAA GGTGCCACTC

2001 CCACTGTCCT TTCCTAATAA AATGAGGAAA TTGCATCGCA TTGTCTGAGT

2051 AGGTGTCATT CTATTCTGGG GGGTGGGGTG GGGCAGGACA GCAAGGGGGA

2101 GGATTGGGAA GACAATAGCA GGCATGCTGG GGATGCGGTG GGCTCTATGG

2151 CCGCTGCGGC CAGGTGCTGA AGAATTGACC CGGTTCCTCC TGGGCCAGAA 

2201 AGAAGCAGGC ACATCCCCTT CTCTGTGACA CACCCTGTCC ACGCCCCTGG 

2251 TTCTTAGTTC CAGCCCCACT CATAGGACAC TCATAGCTCA GGAGGGCTCC 

2301 GCCTTCAATC CCACCCGCTA AAGTACTTGG AGCGGTCTCT CCCTCCCTCA 

2351 TCAGCCCACC AAACCAAACC TAGCCTCCAA GAGTGGGAAG AAATTAAAGC 

2401 AAGATAGGCT ATTAAGTGCA GAGGGAGAGA AAATGCCTCC AACATGTGAG 

2451 GAAGTAATGA GAGAAATCAT AGAATTTCTT CCGCTTCCTC GCTCACTGAC 

2501 TCGCTGCGCT CGGTCGTTCG GCTGCGGCGA GCGGTATCAG CTCACTCAAA 

2551 GGCGGTAATA CGGTTATCCA CAGAATCAGG GGATAACGCA GGAAAGAACA 

2601 TGTGAGCAAA AGGCCAGCAA AAGGCCAGGA ACCGTAAAAA GGCCGCGTTG 

2651 CTGGCGTTTT TCCATAGGCT CCGCCCCCCT GACGAGCATC ACAAAAATCG 

2701 ACGCTCAAGT CAGAGGTGGC GAAACCCGAC AGGACTATAA AGATACCAGG 

2751 CGTTTCCCCC TGGAAGCTCC CTCGTGCGCT CTCCTGTTCC GACCCTGCCG 

2801 CTTACCGGAT ACCTGTCCGC CTTTCTCCCT TCGGGAAGCG TGGCGCTTTC 

2851 TCATAGCTCA CGCTGTAGGT ATCTCAGTTC GGTGTAGGTC GTTCGCTCCA 

2901 AGCTGGGCTG TGTGCACGAA CCCCCCGTTC AGCCCGACCG CTGCGCCTTA 

2951 TCCGGTAACT ATCGTCTTGA GTCCAACCCG GTAAGACACG ACTTATCGCC 

3001 ACTGGCAGCA GCCACTGGTA ACAGGATTAG CAGAGCGAGG TATGTAGGCG 

3051 GTGCTACAGA GTTCTTGAAG TGGTGGCCTA ACTACGGCTA CACTAGAAGA 

3101 ACAGTATTTG GTATCTGCGC TCTGCTGAAG CCAGTTACCT TCGGAAAAAG 

3151 AGTTGGTAGC TCTTGATCCG GCAAACAAAC CACCGCTGGT AGCGGTGGTT 

3201 TTTTTGTTTG CAAGCAGCAG ATTACGCGCA GAAAAAAAGG ATCTCAAGAA 

3251 GATCCTTTGA TCTTTTCTAC GGGGTCTGAC GCTCAGTGGA ACGAAAACTC 

3301 ACGTTAAGGG ATTTTGGTCA TGAGATTATC AAAAAGGATC TTCACCTAGA 

3351 TCCTTTTAAA TTAAAAATGA AGTTTTAAAT CAATCTAAAG TATATATGAG 

3401 TAAACTTGGT CTGACAGTTA CCAATGCTTA ATCAGTGAGG CACCTATCTC 

3451 AGCGATCTGT CTATTTCGTT CATCCATAGT TGCCTGACTC GGGGGGGGGG 

3501 GGCGCTGAGG TCTGCCTCGT GAAGAAGGTG TTGCTGACTC ATACCAGGCC 

3551 TGAATCGCCC CATCATCCAG CCAGAAAGTG AGGGAGCCAC GGTTGATGAG 

3601 AGCTTTGTTG TAGGTGGACC AGTTGGTGAT TTTGAACTTT TGCTTTGCCA 

3651 CGGAACGGTC TGCGTTGTCG GGAAGATGCG TGATCTGATC CTTCAACTCA 

3701 GCAAAAGTTC GATTTATTCA ACAAAGCCGC CGTCCCGTCA AGTCAGCGTA 

3751 ATGCTCTGCC AGTGTTACAA CCAATTAACC AATTCTGATT AGAAAAACTC 

3801 ATCGAGCATC AAATGAAACT GCAATTTATT CATATCAGGA TTATCAATAC 

3851 CATATTTTTG AAAAAGCCGT TTCTGTAATG AAGGAGAAAA CTCACCGAGG

3901 CAGTTCCATA GGATGGCAAG ATCCTGGTAT CGGTCTGCGA TTCCGACTCG 

3951 TCCAACATCA ATACAACCTA TTAATTTCCC CTCGTCAAAA ATAAGGTTAT

4001 CAAGTGAGAA ATCACCATGA GTGACGACTG AATCCGGTGA GAATGGCAAA 

4051 AGCTTATGCA TTTCTTTCCA GACTTGTTCA ACAGGCCAGC CATTACGCTC 

4101 GTCATCAAAA TCACTCGCAT CAACCAAACC GTTATTCATT CGTGATTGCG 

4151 CCTGAGCGAG ACGAAATACG CGATCGCTGT TAAAAGGACA ATTACAAACA 
4201 GGAATCGAAT GCAACCGGCG CAGGAACACT GCCAGCGCAT CAACAATATT

4251 TTCACCTGAA TCAGGATATT CTTCTAATAC CTGGAATGCT GTTTTCCCGG

4301 GGATCGCAGT GGTGAGTAAC CATGCATCAT CAGGAGTACG GATAAAATGC

4351 TTGATGGTCG GAAGAGGCAT AAATTCCGTC AGCCAGTTTA GTCTGACCAT

4401 CTCATCTGTA ACATCATTGG CAACGCTACC TTTGCCATGT TTCAGAAACA

4451 ACTCTGGCGC ATCGGGCTTC CCATACAATC GATAGATTGT CGCACCTGAT

4501 TGCCCGACAT TATCGCGAGC CCATTTATAC CCATATAAAT CAGCATCCAT

4551 GTTGGAATTT AATCGCGGCC TCGAGCAAGA CGTTTCCCGT TGAATATGGC

4601 TCATAACACC CCTTGTATTA CTGTTTATGT AAGCAGACAG TTTTATTGTT

4651 CATGATGATA TATTTTTATC TTGTGCAATG TAACATCAGA GATTTTGAGA

4701 CACAACGTGG CTTTCCCCCC CCCCCCATTA TTGAAGCATT TATCAGGGTT

4751 ATTGTCTCAT GAGCGGATAC ATATTTGAAT GTATTTAGAA AAATAAACAA

4801 ATAGGGGTTC CGCGCACATT TCCCCGAAAA GTGCCACCTG ACGTCTAAGA

4851 AACCATTATT ATCATGACAT TAACCTATAA AAATAGGCGT ATCACGAGGC

4901 CCTTTCGTC

1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT 
121 GATGTTGTAA GTGTGGCGGA ACACATGTAA GCGCCGGATG TGGTAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACGG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 
241 TAAATTTGGG CGTAACCAAG TAATATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA 
301 AGTGAAATCT GAATAATTCT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG 
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG CGCAGTGTAT TTATACCCGG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC 
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GAGTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAT TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA 
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA 
901 CCTTGTGCCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAAATTATGG GCAGTGGGTG 
1141 ATAGAGTGGT GGGTTTGGTG TGGTAATTTT TTTTTTAATT TTTACAGTTT TGTGGTTTAA 
1201 AGAATTTTGT ATTGTGATTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGG CGTCCTAAAT TGGTGCCTGC TATCCTGAGA 
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 
1501 TCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 
1801 TTTCTGTGGG GCTCCTCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT GGTGAGACAC 
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCAA TAATACCGAC GGAGGAGCAA 
2161 CAGCAGGAGG AAGCCAGGCG GCGGCGGCGG CAGGAGCAGA GCCCATGGAA CCCGAGAGCC 
2221 GGCCTGGACC CTCGGGAATG AATGTTGTAC AGGTGGCTGA ACTGTTTCCA GAACTGAGAC 
2281 GCATTTTAAC CATTAACGAG GATGGGCAGG GGCTAAAGGG GGTAAAGAAG GAGCGGGGGG 
2341 CTTCTGAGGC TACAGAGGAG GCTAGGAATC TAACTTTTAG CTTAATGACC AGACACCGTC 
2401 CTGAGTGTGT TACTTTTCAG CAGATTAAGG ATAATTGCGC TAATGAGCTT GATCTGCTGG 
2461 CGCAGAAGTA TTCCATAGAG CAGCTGACCA CTTACTGGCT GCAGCCAGGG GATGATTTTG 
2521 AGGAGGCTAT TAGGGTATAT GCAAAGGTGG CACTTAGGCC AGATTGCAAG TACAAGATTA
2581 GCAAACTTGT AAATATCAGG AATTGTTGCT ACATTTCTGG GAACGGGGCC GAGGTGGAGA 
2641 TAGATACGGA GGATAGGGTG GCCTTTAGAT GTAGCATGAT AAATATGTGG CCGGGGGTGC 
2701 TTGGCATGGA CGGGGTGGTT ATTATGAATG TGAGGTTTAC TGGTCCCAAT TTTAGCGGTA 
2761 CGGTTTTCCT GGCCAATACC AATCTTATCC TACACGGTGT AAGCTTCTAT GGGTTTAACA
2821 ATACCTGTGT GGAAGCCTGG ACCGATGTAA GGGTTCGGGG CTGTGCCTTT TACTGCTGCT 
2881 GGAAGGGGGT GGTGTGTCGC CCCAAAAGCA GGGCTTCAAT TAAGAAATGC CTGTTTGAAA 
2941 GGTGTACCTT GGGTATCCTG TCTGAGGGTA ACTCCAGGGT GCGCCACAAT GTGGCCTCCG 
3001 ACTGTGGTTG CTTTATGCTA GTGAAAAGCG TGGCTGTGAT TAAGCATAAC ATGGTGTGTG 
3061 GCAACTGCGA GGACAGGGCC TCTCAGATGC TGACCTGCTC GGACGGCAAC TGTCACTTGC 
3121 TGAAGACCAT TCACGTAGCC AGCCACTCTC GCAAGGCCTG GCCAGTGTTT GAGCACAACA 
3181 TACTGACCCG CTGTTCCTTG CATTTGGGTA ACAGGAGGGG GGTGTTCCTA CCTTACCAAT 
3241 GCAATTTGAG TCACACTAAG ATATTGCTTG AGCCCGAGAG CATGTCCAAG GTGAACCTGA 
3301 ACGGGGTGTT TGACATGACC ATGAAGATCT GGAAGGTGCT GAGGTACGAT GAGACCCGCA 
3361 CCAGGTGCAG ACCCTGCGAG TGTGGCGGTA AACATATTAG GAACCAGCCT GTGATGCTGG
3421 ATGTGACCGA GGAGCTGAGG CCCGATCACT TGGTGCTGGC CTGCACCCGC GCTGAGTTTG 
3481 GCTCTAGCGA TGAAGATACA GATTGAGGTA CTGAAATGTG TGGGCGTGGC TTAAGGGTGG 
3541 GAAAGAATAT ATAAGGTGGG GGTCTCATGT AGTTTTGTAT CTGTTTTGCA GCAGCCGCCG 
3601 CCATGAGCGC CAACTCGTTT GATGGAAGCA TTGTGAGCTC ATATTTGACA ACGCGCATGC 
3661 CCCCATGGGC CGGGGTGCGT CAGAATGTGA TGGGCTCCAG CATTGATGGT CGCCCCGTCC 
3721 TGCCCGCAAA CTCTACTACC TTGACCTACG AGACCGTGTC TGGAACGCCG TTGGAGACTG 
3781 CAGCCTCCGC CGCCGCTTCA GCCGCTGCAG CCACCGCCCG CGGGATTGTG ACTGACTTTG 
3841 CTTTCCTGAG CCCGCTTGCA AGCAGTGCAG CTTCCCGTTC ATCCGCCCGC GATGACAAGT 
3901 TGACGGCTCT TTTGGCACAA TTGGATTCTT TGACCCGGGA ACTTAATGTC GTTTCTCAGC 
3961 AGCTGTTGGA TCTGCGCCAG CAGGTTTCTG CCCTGAAGGC TTCCTCCCCT CCCAATGCGG 
4021 TTTAAAACAT AAATAAAAAC CAGACTCTGT TTGGATTTGG ATCAAGCAAG TGTCTTGCTG 
4081 TCTTTATTTA GGGGTTTTGC GCGCGCGGTA GGCCCGGGAC CAGCGGTCTC GGTCGTTGAG 
4141 GGTCCTGTGT ATTTTTTCCA GGACGTGGTA AAGGTGACTC TGGATGTTCA GATACATGGG 
4201 CATAAGCCCG TCTCTGGGGT GGAGGTAGCA CCACTGCAGA GCTTCATGCT GCGGGGTGGT 
4261 GTTGTAGATG ATCCAGTCGT AGCAGGAGCG CTGGGCGTGG TGCCTAAAAA TGTCTTTCAG 
4321 TAGCAAGCTG ATTGCCAGGG GCAGGCCCTT GGTGTAAGTG TTTACAAAGC GGTTAAGCTG 
4381 GGATGGGTGC ATACGTGGGG ATATGAGATG CATCTTGGAC TGTATTTTTA GGTTGGCTAT 
4441 GTTCCCAGCC ATATCCCTCC GGGGATTCAT GTTGTGCAGA ACCACCAGCA CAGTGTATCC 
4501 GGTGCACTTG GGAAATTTGT CATGTAGCTT AGAAGGAAAT GCGTGGAAGA ACTTGGAGAC 
4561 GCCCTTGTGA CCTCCAAGAT TTTCCATGCA TTCGTCCATA ATGATGGCAA TGGGCCCACG 
4621 GGCGGCGGCC TGGGCGAAGA TATTTCTGGG ATCACTAACG TCATAGTTGT GTTCCAGGAT 
4681 GAGATCGTCA TAGGCCATTT TTACAAAGCG CGGGCGGAGG GTGCCAGACT GCGGTATAAT 
4741 GGTTCCATCC GGCCCAGGGG CGTAGTTACC CTCACAGATT TGCATTTCCC ACGCTTTGAG 
4801 TTCAGATGGG GGGATCATGT CTACCTGCGG GGCGATGAAG AAAACCGTTT CCGGGGTAGG 
4861 GGAGATCAGC TGGGAAGAAA GCAGGTTCCT AAGCAGCTGC GACTTACCGC AGCCGGTGGG 
4921 CCCGTAAATC ACACCTATTA CCGGCTGCAA CTGGTAGTTA AGAGAGCTGC AGCTGCCGTC 
4981 ATCCCTGAGC AGGGGGGCCA CTTCGTTAAG CATGTCCCTG ACTTGCATGT TTTCCCTGAC 
5041 CAAATCCGCC AGAAGGCGCT CGCCGCCCAG CGATAGCAGT TCTTGCAAGG AAGCAAAGTT
5101 TTTCAACGGT TTGAGGCCGT CCGCCGTAGG CATGCTTTTG AGCGTTTGAC CAAGCAGTTC 
5161 CAGGCGGTCC CACAGCTCGG TCACGTGCTC TACGGCATCT CGATCCAGCA TATCTCCTCG 
5221 TTTCGCGGGT TGGGGCGGCT TTCGCTGTAC GGCAGTAGTC GGTGCTCGTC CAGACGGGCC 
5281 AGGGTCATGT CTTTCCACGG GCGCAGGGTC CTCGTCAGCG TAGTCTGGGT CACGGTGAAG
5341 GGGTGCGCTC CGGGTTGCGC GCTGGCCAGG GTGCGCTTGA GGCTGGTCCT GCTGGTGCTG 
5401 AAGCGCTGCC GGTCTTCGCC CTGCGCGTCG GCCAGGTAGC ATTTGACCAT GGTGTCATAG 
5461 TCCAGCCCCT CCGCGGCGTG GCCCTTGGCG CGCAGCTTGC CCTTGGAGGA GGCGCCGCAC 
5521 GAGGGGCAGT GCAGACTTTT AAGGGCGTAG AGCTTGGGCG CGAGAAATAC CGATTCCGGG 
5581 GAGTAGGCAT CCGCGCCGCA GGCCCCGCAG ACGGTCTCGC ATTCCACGAG CCAGGTGAGC 
5641 TCTGGCCGTT CGGGGTCAAA AACCAGGTTT CCCCCATGCT TTTTGATGCG TTTCTTACCT 
5701 CTGGTTTCCA TGAGCCGGTG TCCACGCTCG GTGACGAAAA GGCTGTCCGT GTCCCCGTAT
5761 ACAGACTTGA GAGGCCTGTC CTCGAGCGGT GTTCCGCGGT CCTCCTCGTA TAGAAACTCG 
5821 GACCACTCTG AGACGAAGGC TCGCGTCCAG GCCAGCACGA AGGAGGCTAA GTGGGAGGGG 
5881 TAGCGGTCGT TGTCCACTAG GGGGTCCACT CGCTCCAGGG TGTGAAGACA CATGTCGCCC 
5941 TCTTCGGCAT CAAGGAAGGT GATTGGTTTA TAGGTGTAGG CCACGTGACC GGGTGTTCCT 
6001 GAAGGGGGGC TATAAAAGGG GGTGGGGGCG CGTTCGTCCT CACTCTCTTC CGCATCGCTG 
6061 TCTGCGAGGG CCAGCTGTTG GGGTGAGTAC TCCCTCTCAA AAGCGGGCAT GACTTCTGCG 
6121 CTAAGATTGT CAGTTTCCAA AAACGAGGAG GATTTGATAT TCACCTGGCC CGCGGTGATG 
6181 CCTTTGAGGG TGGCCGCGTC CATCTGGTCA GAAAAGACAA TCTTTTTGTT GTCAAGCTTG 
6241 GTGGCAAACG ACCCGTAGAG GGCGTTGGAC AGCAACTTGG CGATGGAGCG CAGGGTTTGG 
6301 TTTTTGTCGC GATCGGCGCG CTCCTTGGCC GCGATGTTTA GCTGCACGTA TTCGCGCGCA 
6361 ACGCACCGCC ATTCGGGAAA GACGGTGGTG CGCTCGTCGG GCACTAGGTG CACGCGCCAA
6421 CCGCGGTTGT GCAGGGTGAC AAGGTCAACG CTGGTGGCTA CCTCTCCGCG TAGGCGCTCG 
6481 TTGGTCCAGC AGAGGCGGCC GCCCTTGCGC GAGCAGAATG GCGGTAGTGG GTCTAGCTGC 
6541 GTCTCGTCCG GGGGGTCTGC GTCCACGGTA AAGACCCCGG GCAGCAGGCG CGCGTCGAAG 
6601 TAGTCTATCT TGCATCCTTG CAAGTCTAGC GCCTGCTGCC ATGCGCGGGC GGCAAGCGCG 
6661 CGCTCGTATG GGTTGAGTGG GGGACCCCAT GGCATGGGGT GGGTGAGCGC GGAGGCGTAC 
6721 ATGCCGCAAA TGTCGTAAAC GTAGAGGGGC TCTCTGAGTA TTCCAAGATA TGTAGGGTAG 
6781 CATCTTCCAC CGCGGATGCT GGCGCGCACG TAATCGTATA GTTCGTGCGA GGGAGCGAGG 
6841 AGGTCGGGAC CGAGGTTGCT ACGGGCGGGC TGCTCTGCTC GGAAGACTAT CTGCCTGAAG 
6901 ATGGCATGTG AGTTGGATGA TATGGTTGGA CGCTGGAAGA CGTTGAAGCT GGCGTCTGTG 
6961 AGACCTACCG CGTCACGCAC GAAGGAGGCG TAGGAGTCGC GCAGCTTGTT GACCAGCTCG 
7021 GCGGTGACCT GCACGTCTAG GGCGCAGTAG TCCAGGGTTT CCTTGATGAT GTCATACTTA 
7081 TCCTGTCCCT TTTTTTTCCA CAGCTCGCGG TTGAGGACAA ACTCTTCGCG GTCTTTCCAG 
7141 TACTCTTGGA TCGGAAACCC GTCGGCCTCC GAACGGTAAG AGCCTAGCAT GTAGAACTGG 
7201 TTGACGGCCT GGTAGGCGCA GCATCCCTTT TCTACGGGTA GCGCGTATGC CTGCGCGGCC 
7261 TTCCGGAGCG AGGTGTGGGT GAGCGCAAAG GTGTCCCTAA CCATGACTTT GAGGTACTGG 
7321 TATTTGAAGT CAGTGTCGTC GCATCCGCCC TGCTCCCAGA GCAAAAAGTC CGTGCGCTTT 
7381 TTGGAACGCG GGTTTGGCAG GGCGAAGGTG ACATCGTTGA AGAGTATCTT TCCCGCGCGA 
7441 GGCATAAAGT TGCGTGTGAT GCGGAAGGGT CCCGGCACCT CGGAACGGTT GTTAATTACC 
7501 TGGGCGGCGA GCACGATCTC GTCAAAGCCG TTGATGTTGT GGCCCACAAT GTAAAGTTCC 
7561 AAGAAGCGCG GGATGCCCTT GATGGAAGGC AATTTTTTAA GTTCCTCGTA GGTGAGCTCT
7621 TCAGGGGAGC TGAGCCCGTG CTCTGAAAGG GCCCAGTCTG CAAGATGAGG GTTGGAAGCG 
7681 ACGAATGAGC TCCACAGGTC ACGGGCCATT AGCATTTGCA GGTGGTCGCG AAAGGTCCTA 
7741 AACTGGCGAC CTATGGCCAT TTTTTCTGGG GTGATGCAGT AGAAGGTAAG CGGGTCTTGT 
7801 TCCCAGCGGT CCCATCCAAG GTCCGCGGCT AGGTCTCGCG CGGCGGTCAC TAGAGGCTCA 
7861 TCTCCGCCGA ACTTCATGAC CAGCATGAAG GGCACGAGCT GCTTCCCAAA GGCCCCCATC
7921 CAAGTATAGG TCTCTACATC GTAGGTGACA AAGAGACGCT CGGTGCGAGG ATGCGAGCCG 
7981 ATCGGGAAGA ACTGGATCTC CCGCCACCAG TTGGAGGAGT GGCTGTTGAT GTGGTGAAAG 
8041 TAGAAGTCCC TGCGACGGGC CGAACACTCG TGCTGGCTTT TGTAAAAACG TGCGCAGTAC 
8101 TGGCAGCGGT GCACGGGCTG TACATCCTGC ACGAGGTTGA CCTGACGACC GCGCACAAGG 
8161 AAGCAGAGTG GGAATTTGAG CCCCTCGCCT GGCGGGTTTG GCTGGTGGTC TTCTACTTCG 
8221 GCTGCTTGTC CTTGACCGTC TGGCTGCTCG AGGGGAGTTA CGGTGGATCG GACCACCACG 
8281 CCGCGCGAGC CCAAAGTCCA GATGTCCGCG CGCGGCGGTC GGAGCTTGAT GACAACATCG 
8341 CGCAGATGGG AGCTGTCCAT GGTCTGGAGC TCCCGCGGGG TCAGGTCAGG CGGGAGCTCC 
8401 TGCAGGTTTA CCTCGCATAG CCGGGTCAGG GCGCGGGCTA GGTCCAGGTG ATACCTGATT 
8461 TCCAGGGGCT GGTTGGTGGC GGCGTCGATG GCTTGCAAGA GGCCGCATCC CCGCGGCGCG 
8521 ACTACGGTAC CGCGCGGCGG GCGGTGGGCC GCGGGGGTGT CCTTGGATGA TGCATCTAAA 
8581 AGCGGTGACG CGGGCGGGCC CCCGGAGGTA GGGGGGGCTC GGGACCCGCC GGGAGAGGGG 
8641 GCAGGGGCAC GTCGGCGCCG CGCGCGGGCA GGAGCTGGTG CTGCGCGCGG AGGTTGCTGG 
8701 CGAACGCGAC GACGCGGCGG TTGATCTCCT GAATCTGGCG CCTCTGCGTG AAGACGACGG 
8761 GCCCGGTGAG CTTGAACCTG AAAGAGAGTT CGACAGAATC AATTTCGGTG TCGTTGACGG 
8821 CGGCCTGGCG CAAAATCTCC TGCACGTCTC CTGAGTTGTC TTGATAGGCG ATCTCGGCCA 
8881 TGAACTGCTC GATCTCTTCC TCCTGGAGAT CTCCGCGTCC GGCTCGCTCC ACGGTGGCGG 
8941 CGAGGTCGTT GGAGATGCGG GCCATGAGCT GCGAGAAGGC GTTGAGGCCT CCCTCGTTCC 
9001 AGACGCGGCT GTAGACCACG CCCCCTTCGG CATCGCGGGC GCGCATGACC ACCTGCGCGA 
9061 GATTGAGCTC CACGTGCCGG GCGAAGACGG CGTAGTTTCG CAGGCGCTGA AAGAGGTAGT 
9121 TGAGGGTGGT GGCGGTGTGT TCTGCCACGA AGAAGTACAT AACCCAGCGC CGCAACGTGG 
9181 ATTCGTTGAT ATCCCCCAAG GCCTCAAGGC GCTCCATGGC CTCGTAGAAG TCCACGGCGA 
9241 AGTTGAAAAA CTGGGAGTTG CGCGCCGACA CGGTTAACTC CTCCTCCAGA AGACGGATGA 
9301 GCTCGGCGAC AGTGTCGCGC ACCTCGCGCT CAAAGGCTAC AGGGGCCTCT TCTTCTTCTT 
9361 CAATCTCCTC TTCCATAAGG GCCTCCCCTT CTTCTTCTTC TGGCGGCGGT GGGGGAGGGG 
9421 GGACACGGCG GCGACGACGG CGCACCGGGA GGCGGTCGAC AAAGCGCTCG ATCATCTCCC 
9481 CGCGGCGACG GCGCATGGTC TCGGTGACGG CGCGGCCGTT CTCGCGGGGG CGCAGTTGGA 
9541 AGACGCCGCC CGTCATGTCC CGGTTATGGG TTGGCGGGGG GCTGCCGTGC GGCAGGGATA 
9601 CGGCGCTAAC GATGCATCTC AACAATTGTT GTGTAGGTAC TCCGCCACCG AGGGACCTGA 
9661 GCGAGTCCGC ATCGACCGGA TCGGAAAACC TCTCGAGAAA GGCGTCTAAC CAGTCACAGT 
9721 CGCAAGGTAG GCTGAGCACC GTGGCGGGCG GCAGCGGGCG GCGGTCGGGG TTGTTTCTGG
9781 CGGAGGTGCT GCTGATGATG TAATTAAAGT AGGCGGTCTT GAGACGGCGG ATGGTCGACA
9841 GAAGCACCAT GTCCTTGGGT CCGGCCTGCT GAATGCGCAG GCGGTCGGCC ATGCCCCAGG 
9901 CTTCGTTTTG ACATCGGCGC AGGTCTTTGT AGTAGTCTTG CATGAGCCTT TCTACCGGCA 
9961 CTTCTTCTTC TCCTTCCTCT TGTCCTGCAT CTCTTGCATC TATCGCTGCG GCGGCGGCGG 
10021 AGTTTGGCCG TAGGTGGCGC CCTCTTCCTC CCATGCGTGT GACCCCGAAG CCCCTCATCG 
10081 GCTGAAGCAG GGCCAGGTCG GCGACAACGC GCTCGGCTAA TATGGCCTGC TGCACCTGCG
10141 TGAGGGTAGA CTGGAAGTCG TCCATGTCCA CAAAGCGGTG GTATGCGCCC GTGTTGATGG 
10201 TGTAAGTGCA GTTGGCCATA ACGGACCAGT TAACGGTCTG GTGACCCGGC TGCGAGAGCT 
10261 CGGTGTACCT GAGACGCGAG TAAGCCCTTG AGTCAAAGAC GTAGTCGTTG CAAGTCCGCA 
10321 CCAGGTACTG GTATCCCACC AAAAAGTGCG GCGGCGGCTG GCGGTAGAGG GGCCAGCGTA 
10381 GGGTGGCCGG GGCTCCGGGG GCGAGGTCTT CCAACATAAG GCGATGATAT CCGTAGATGT 
10441 ACCTGGACAT CCAGGTGATG CCGGCGGCGG TGGTGGAGGC GCGCGGAAAG TCACGGACGC 
10501 GGTTCCAGAT GTTGCGCAGC GGCAAAAAGT GCTCCATGGT CGGGACGCTC TGGCCGGTCA 
10561 GGCGCGCGCA GTCGTTGACG CTCTAGACCG TGCAAAAGGA GAGCCTGTAA GCGGGCACTC 
10621 TTCCGTGGTC TGGTGGATAA ATTCGCAAGG GTATCATGGC GGACGACCGG GGTTCGAACC 
10681 CCGGATCCGG CCGTCCGCCG TGATCCATGC GGTTACCGCC CGCGTGTCGA ACCCAGGTGT 
10741 GCGACGTCAG ACAACGGGGG AGCGCTCCTT TTGGCTTCCT TCCAGGCGCG GCGGATGCTG 
10801 CGCTAGCTTT TTTGGCCACT GGCCGCGCGC GGCGTAAGCG GTTAGGCTGG AAAGCGAAAG 
10861 CATTAAGTGG CTCGCTCCCT GTAGCCGGAG GGTTATTTTC CAAGGGTTGA GTCGCGGGAC 
10921 CCCCGGTTCG AGTCTCGGGC CGGCCGGACT GCGGCGAACG GGGGTTTGCC TCCCCGTCAT 
10981 GCAAGACCCC GCTTGCAAAT TCCTCCGGAA ACAGGGACGA GCCCCTTTTT TGCTTTTCCC 
11041 AGATGCATCC GGTGCTGCGG CAGATGCGCC CCCCTCCTCA GCAGCGGCAA GAGCAAGAGC 
11101 AGCGGCAGAC ATGCAGGGCA CCCTCCCCTT CTCCTACCGC GTCAGGAGGG GCAACATCCG 
11161 CGGCTGACGC GGCGGCAGAT GGTGATTACG AACCCCCGCG GCGCCGGACC CGGCACTACT 
11221 TGGACTTGGA GGAGGGCGAG GGCCTGGCGC GGCTAGGAGC GCCCTCTCCT GAGCGACACC 
11281 CAAGGGTGCA GCTGAAGCGT GACACGCGCG AGGCGTACGT GCCGCGGCAG AACCTGTTTC 
11341 GCGACCGCGA GGGAGAGGAG CCCGAGGAGA TGCGGGATCG AAAGTTCCAT GCAGGGCGCG 
11401 AGTTGCGGCA TGGCCTGAAC CGCGAGCGGT TGCTGCGCGA GGAGGACTTT GAGCCCGACG 
11461 CGCGGACCGG GATTAGTCCC GCGCGCGCAC ACGTGGCGGC CGCCGACCTG GTAACCGCGT 
11521 ACGAGCAGAC GGTGAACCAG GAGATTAACT TTCAAAAAAG CTTTAACAAC CACGTGCGCA 
11581 CGCTTGTGGC GCGCGAGGAG GTGGCTATAG GACTGATGCA TCTGTGGGAC TTTGTAAGCG 
11641 CGCTGGAGCA AAACCCAAAT AGCAAGCCGC TCATGGCGCA GCTGTTCCTT ATAGTGCAGC 
11701 ACAGCAGGGA CAACGAGGCA TTCAGGGATG CGCTGCTAAA CATAGTAGAG CCCGAGGGCC 
11761 GCTGGCTGCT CGATTTGATA AACATTCTGC AGAGCATAGT GGTGCAGGAG CGCAGCTTGA 
11821 GCCTGGCTGA CAAGGTGGCC GCCATTAACT ATTCCATGCT CAGTCTGGGC AAGTTTTACG 
11881 CCCGCAAGAT ATACCATACC CCTTACGTTC CCATAGACAA GGAGGTAAAG ATCGAGGGGT 
11941 TCTACATGCG CATGGCGCTG AAGGTGCTTA CCTTGAGCGA CGACCTGGGC GTTTATCGCA 
12001 ACGAGCGCAT CCACAAGGCC GTGAGCGTGA GCCGGCGGCG CGAGCTCAGC GACCGCGAGC 
12061 TGATGCACAG CCTGCAAAGG GCCCTGGCTG GCACGGGCAG CGGCGATAGA GAGGCCGAGT 
12121 CCTACTTTGA CGCGGGCGCT GACCTGCGCT GGGCCCCAAG CCGACGCGCC CTGGAGGCAG 
12181 CTGGGGCCGG ACCTGGGCTG GCGGTGGCAC CCGCGCGCGC TGGCAACGTC GGCGGCGTGG 
12241 AGGAATATGA CGAGGACGAT GAGTACGAGC CAGAGGACGG CGAGTACTAA GCGGTGATGT 
12301 TTCTGATCAG ATGATGCAAG ACGCAACGGA CCCGGCGGTG CGGGCGGCGC TGCAGAGCCA
12361 GCCGTCCGGC CTTAACTCCA CGGACGACTG GCGCCAGGTC ATGGACCGCA TCATGTCGCT 
12421 GACTGCGCGC AACCCTGACG CGTTCCGGCA GCAGCCGCAG GCCAACCGGC TCTCCGCAAT 
12481 TCTGGAAGCG GTGGTCCCGG CGCGCGCAAA CCCCACGCAC GAGAAGGTGC TGGCGATCGT 
12541 AAACGCGCTG GCCGAAAACA GGGCCATCCG GCCCGATGAG GCCGGCCTGG TCTACGACGC 
15121 GGCAATGCAG CCGGTGGAGG ACATGAACGA TCATGCCATT CGCGGCGACA CCTTTGCCAC
15181 ACGGGCGGAG GAGAAGCGCG CTGAGGCCGA GGCAGCGGCC GAAGCTGCCG CCCCCGCTGC 
15241 GGAGGCTGCA CAACCCGAGG TCGAGAAGCC TCAGAAGAAA CCGGTGATTA AACCCCTGAC 
15301 AGAGGACAGC AAGAAACGCA GTTACAACCT AATAAGCAAT GACAGCACCT TCACCCAGTA
15361 CCGCAGCTGG TACCTTGCAT ACAACTACGG CGACCCTCAG GCCGGGATCC GCTCATGGAC
15421 CCTGCTTTGC ACTCCTGACG TAACCTGCGG CTCGGAGCAG GTATACTGGT CGTTGCCCGA 
15481 CATGATGCAA GACCCCGTGA CCTTCCGCTC CACGCGCCAG ATCAGCAACT TTCCGGTGGT 
15541 GGGCGCCGAG CTGTTGCCCG TGCACTCCAA GAGCTTCTAC AACGACCAGG CCGTCTACTC 
15601 CCAGCTCATC CGCCAGTTTA CCTCTCTGAC CCACGTGTTC AATCGCTTTC CCGAGAACCA 
15661 GATTTTGGCG CGCCCGCCAG CCCCCACCAT CACCACCGTC AGTGAAAACG TTCCTGCTCT 
15721 CACAGATCAC GGGACGCTAC CGCTGCGCAA CAGCATCGGA GGAGTCCAGC GAGTGACCAT 
15781 TACTGACGCC AGACGCCGCA CCTGCCCCTA CGTTTACAAG GCCCTGGGCA TAGTCTCGCC 
15841 GCGCGTCCTA TCGAGCCGCA CTTTTTGAGC AAGCATGTCC ATCCTTATAT CGCCCAGCAA 
15901 TAACACAGGC TGGGGCCTGC GCTTCCCAAG CAAGATGTTT GGCGGGGCCA AGAAGCGCTC 
15961 CGACCAACAC CCAGTGCGCG TGCGCGGGCA CTACCGCGCG CCCTGGGGCG CGCACAAACG 
16021 CGGCCGCACT GGGCGCACCA CCGTCGATGA CGCCATCGAC GCGGTGGTGG AGGAGGCGCG 
16081 CAACTACACG CCCACGCCGC CGCCAGTGTC CACCGTGGAC GCGGCCATTC AGACCGTGGT 
16141 GCGCGGAGCC CGGCGCTACG CTAAAATGAA GAGACGGCGG AGGCGCGTAG CACGTCGCCA 
16201 CCGCCGCCGA CCCGGCACTG CCGCCCAACG CGCGGCGGCG GCCCTGCTTA ACCGCGCACG 
16261 TCGCACCGGC CGACGGGCGG CCATGCGAGC CGCTCGAAGG CTGGCCGCGG GTATTGTCAC 
16321 TGTGCCCCCC AGGTCCAGGC GACGAGCGGC CGCCGCAGCA GCCGCGGCCA TTAGTGCTAT 
16381 GACTCAGGGT CGCAGGGGCA ACGTGTACTG GGTGCGCGAC TCGGTTAGCG GCCTGCGCGT 
16441 GCCCGTGCGC ACCCGCCCCC CGCGCAACTA GATTGCAATA AAAAACTACT TAGACTCGTA 
16501 CTGTTGTATG TATCCAGCGG CGGCGGCGCG CATCGAAGCT ATGTCCAAGC GCAAAATCAA 
16561 AGAAGAGATG CTCCAGGTCA TCGCGCCGGA GATCTATGGC CCCCCGAAGA AGGAAGAGCA 
16621 GGATTACAAG CCCCGAAAGC TAAAGCGGGT CAAAAAGAAA AAGAAAGATG ATGATGATGA 
16681 TGAACTTGAC GACGAGGTGG AACTGTTGCA CGCGACCGCG CCCAGGCGAC GGGTACAGTG 
16741 GAAAGGTCGA CGCGTAAGAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TGACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGT CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAGC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCACCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACCA CCAGTAGCAC 
17221 TAGTATTGCC ACTGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCGGCGGT 
17281 GGCAGATGCC GCGGTGCAGG CGGCCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 
17341 AACGGACCCG TGGATGTTTC GTGTTTCAGC CCCCCGGCGT CCGCGCCGTT CAAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATCG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGCCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG 
17641 CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG 
17761 CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 
17821 GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC 
17881 CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTACATGTG 
17941 GAAAAATCAA AATAAAAGTC TGGACTCTCA CGCTCGCTTG GTCCTGTAAC TATTTTGTAG
18001 AATGGAAGAC ATCAACTTTG CGTCACTGGC CCCGCGACAC GGCTCGCGCC CGTTCATGGG 
18061 AAACTGGCAA GATATCGGCA CCAGCAATAT GAGCGGTGGC GCCTTCAGCT GGGGCTCGCT 
18121 GTGGAGCGGC ATTAAAAATT TCGGTTCCGC CGTTAAGAAC TATGGCAGCA AAGCCTGGAA 
18181 CAGCAGCACA GGCCAGATGC TGAGGGACAA GTTGAAAGAG CAAAATTTCC AACAAAAGGT 
18241 GGTAGATGGC CTGGCCTCTG GCATTAGCGG GGTGGTGGAC CTGGCCAACC AGGCAGTGCA 
18301 AAATAAGATT AACAGTAAGC TTGATCCCCG CCCTCCCGTA GAGGAGCCTC CACCGGCCGT 
18361 GGAGACAGTG TCTCCAGAGG GGCGTGGCGA AAAGCGTCCG CGACCCGACA GGGAAGAAAC 
18421 TCTGGTGACG CAAATAGACG AGCCTCCCTC GTACGAGGAG GCACTAAAGC AAGGCCTGCC 
18481 CACCACCCGT CCCATCGCGC CCATGGCTAC CGGAGTGCTG GGCCAGCACA CACCCGTAAC 
18541 GCTGGACCTG CCTCCCCCCG CCGACACCCA GCAGAAACCT GTGCTGCCAG GCCCGTCCGC 
18601 CGTTGTTGTA ACCCGTCCTA GCCGCGCGTC CCTGCGCCGC GCCGCCAGCG GTCCGCGATC 
18661 GTTGCGGCCC GTAGCCAGTG GCAACTGGCA AAGCACACTG AACAGCATCG TGGGTTTGGG 
18721 GGTGCAATCC CTGAAGCGCC GACGATGCTT CTGATAGCTA ACGTGTCGTA TGTGTGTCAT 
18781 GTATGCGTCC ATGTCGCCGC CAGAGGAGCT GCTGAGCCGC CGCGCGCCCG CTTTCCAAGA 
18841 TGGCTACCCC TTCGATGATG CCGCAGTGGT CTTACATGCA CATCTCGGGC CAGGACGCCT 
18901 CGGAGTACCT GAGCCCCGGG CTGGTGCAGT TCGCCCGCGC CACCGAGACG TACTTCAGCC 
18961 TGAATAACAA GTTTAGAAAC CCCACGGTGG CGCCTACGCA CGACGTGACC ACAGACCGGT 
19021 CTCAGCGTTT GACGCTGCGG TTCATCCCCG TGGACCGCGA GGATACTGCG TACTCGTACA 
19081 AGGCGCGGTT CACCCTAGCT GTGGGTGATA ACCGTGTGCT AGACATGGCT TCCACGTACT 
19141 TTGACATCCG CGGCGTGCTG GACAGGGGCC CTACTTTTAA GCCCTACTCT GGCACTGCCT 
19201 ACAACGCACT GGCCCCCAAG GGTGCCCCCA ACTCGTGCGA GTGGGAACAA AATGAAACTG 
19261 CACAAGTGGA TGCTCAAGAA CTTGACGAAG AGGAGAATGA AGCCAATGAA GCTCAGGCGC 
19321 GAGAACAGGA ACAAGCTAAG AAAACCCATG TATATGCCCA GGCTCCACTG TCCGGAATAA 
19381 AAATAACTAA AGAAGGTCTA CAAATAGGAA CTGCCGACGC CACAGTAGCA GGTGCCGGCA 
19441 AAGAAATTTT CGCAGACAAA ACTTTTCAAC CTGAACCACA AGTAGGAGAA TCTCAATGGA 
19501 ACGAAGCGGA TGCCACAGCA GCTGGTGGAA GGGTTCTTAA AAAGACAACT CCCATGAAAC 
19561 CCTGCTATGG CTCATACGCT AGACCCACCA ATTCCAACGG CGGACAGGGC GTTATGGTTG 
19621 AACAAAATGG TAAATTGGAA AGTCAAGTCG AAATGCAATT TTTTTCCACA TCCACAAATG 
19681 CCACAAATGA AGTTAACAAT ATACAACCAA CAGTTGTATT GTACAGCGAA GATGTAAACA 
19741 TGGAAACTCC AGATACTCAT CTTTCTTATA AACCTAAAAT GGGGGATAAA AATGCCAAAG 
19801 TCATGCTTGG ACAACAAGCA ATGCCAAACA GACCAAATTA CATTGCTTTT AGAGACAATT 
19861 TTATTGGTCT CATGTATTAC AACAGCACAG GTAACATGGG TGTCCTTGCT GGTCAGGCAT 
19921 CGCAGTTGAA CGCTGTTGTA GATTTGCAAG ACAGAAACAC AGAGCTGTCC TACCAGCTTT 
19981 TGCTTGATTC AATTGGCGAC AGAACAAGAT ACTTTTCAAT GTGGAATCAA GCTGTTGACA 
20041 GCTATGATCC AGATGTCAGA ATTATTGAGA ACCATGGAAC TGAGGATGAG TTGCCAAATT 
20101 ATTGCTTTCC TCTTGGTGGA ATTGGGATTA CTGACACTTT TCAAGCTGTT AAAACAACTG 



FIG. 7H 






20161 CTGCTAACGG GGACCAAGGC AATACTACCT GGCAAAAAGA TTCAACATTT GCAGAACGCA 
20221 ATGAAATAGG GGTGGGAAAT AACTTTGCCA TGGAAATTAA CCTGAATGCC AACCTATGGA 
20281 GAAATTTCCT TTACTCCAAT ATTGCGCTGT ACCTGCCAGA CAAGCTAAAA TACAACCCCA 
20341 CCAATGTGGA AATATCTGAC AACCCCAACA CCTACGACTA CATGAACAAG CGAGTGGTGG 
20401 CTCCTGGGCT TGTAGACTGC TACATTAACC TTGGGGCGCG CTGGTCTCTG GACTACATGG 
20461 ACAACGTTAA TCCCTTTAAC CACCACCGCA ATGCGGGCCT GCGTTACCGC TCCATGTTGT 
20521 TGGGAAACGG CCGCTACGTG CCCTTTCACA TTCAGGTGCC CCAAAAGTTT TTTGCCATTA 
20581 AAAACCTCCT CCTCCTGCCA GGCTCATACA CATATGAATG GAACTTCAGG AAGGATGTTA 
20641 ACATGGTTCT GCAGAGCTCT CTGGGAAACG ACCTTAGAGT TGACGGGGCT AGCATTAAGT 
20701 TTGACAGCAT TTGTCTTTAC GCCACCTTCT TCCCCATGGC CCACAACACG GCCTCCACGC 
20761 TGGAAGCCAT GCTCAGAAAT GACACCAACG ACCAGTCCTT TAATGACTAC CTTTCCGCCG 
20821 CCAACATGCT ATATCCCATA CCCGCCAACG CCACCAACGT GCCCATCTCC ATCCCATCGC 
20881 GCAACTGGGC AGCATTTCGC GGTTGGGCCT TCACACGCTT GAAGACAAAG GAAACCCCTT 
20941 CCCTGGGATC AGGCTACGAC CCTTACTACA CCTACTCTGG CTCCATACCA TACCTTGACG 
21001 GAACCTTCTA TCTTAATCAC ACCTTTAAGA AGGTGGCCAT TACTTTTGAC TCTTCTGTTA
21061 GCTGGCCGGG CAACGACCGC CTGCTTACTC CCAATGAGTT TGAGATTAAG CGCTCAGTTG 
21121 ACGGGGAGGG CTATAACGTA GCTCAGTGCA ACATGACAAA GGACTGGTTC CTAGTGCAGA 
21181 TGTTGGCCAA CTACAATATT GGCTACCAGG GCTTCTACAT TCCAGAAAGC TACAAAGACC 
21241 GCATGTACTC GTTCTTCAGA AACTTCCAGC CCATGAGCCG GCAAGTGGTG GACGATACTA 
21301 AATACAAAGA TTATCAGCAG GTTGGAATTA TCCACCAGCA TAACAACTCA GGCTTCGTAG 
21361 GCTACCTCGC TCCCACCATG CGCGAGGGAC AAGCTTACCC CGCTAATGTT CCCTACCCAC 
21421 TAATAGGCAA AACCGCGGTT GATAGTATTA CCCAGAAAAA GTTTCTTTGC GACCGCACCC 
21481 TGTGGCGCAT CCCCTTCTCC AGTAACTTTA TGTCCATGGG TGCGCTCACA GACCTGGGCC 
21541 AAAACCTTCT CTACGCAAAC TCCGCCCACG CGCTAGACAT GACCTTTGAG GTGGATCCCA 
21601 TGGACGAGCC CACCCTTCTT TATGTTTTGT TTGAAGTCTT TGACGTGGTC CGTGTGCACC 
21661 AGCCGCACCG CGGCGTCATC GAGACCGTGT ACCTGCGCAC GCCCTTCTCG GCCGGCAACG 
21721 CCACAACATA AAGAAGCAAG CAACATCAAC AACAGCTGCC GCCATGGGCT CCAGTGAGCA 
21781 GGAACTGAAA GCCATTGTCA AAGATCTTGG TTGTGGGCCA TATTTTTTGG GCACCTATGA 
21841 CAAGCGCTTC CCAGGCTTTG TTTCCCCACA CAAGCTCGCC TGCGCCATAG TTAACACGGC 
21901 CGGTCGCGAG ACTGGGGGCG TACACTGGAT GGCCTTTGCC TGGAACCCGC GCTCAAAAAC 
21961 ATGCTACCTC TTTGAGCCCT TTGGCTTTTC TGACCAACGT CTCAAGCAGG TTTACCAGTT 
22021 TGAGTACGAG TCACTCCTGC GCCGTAGCGC CATTGCCTCT TCCCCCGACC GCTGTATAAC 
22081 GCTGGAAAAG TCCACCCAAA GCGTGCAGGG GCCCAACTCG GCCGCCTGTG GCCTATTCTG 
22141 CTGCATGTTT CTCCACGCCT TTGCCAACTG GCCCCAAACT CCCATGGATC ACAACCCCAC 
22201 CATGAACCTT ATTACCGGGG TACCCAACTC CATGCTTAAC AGTCCCCAGG TACAGCCCAC 
22261 CCTGCGCCGC AACCAGGAAC AGCTCTACAG CTTCCTGGAG CGCCACTCGC CCTACTTCCG 
22321 CAGCCACAGT GCGCAAATTA GGAGCGCCAC TTCTTTTTGT CACTTGAAAA ACATGTAAAA 
22381 ATAATGTACT AGGAGACACT TTCAATAAAG GCAAATGTTT TTATTTGTAC ACTCTCGGGT 
22441 GATTATTTAC CCCCACCCTT GCCGTCTGCG CCGTTTAAAA ATCAAAGGGG TTCTGCCGCG 
22501 CATCGCTATG CGCCACTGGC AGGGACACGT TGCGATACTG GTGTTTAGTG CTCCACTTAA 
22561 ACTCAGGCAC AACCATCCGC GGCAGCTCGG TGAAGTTTTC ACTCCACAGG CTGCGCACCA 
22621 TCACCAACGC GTTTAGCAGG TCGGGCGCCG ATATCTTGAA GTCGCAGTTG GGGCCTCCGC 



FIG. 7I






22681 CCTGCGCGCG CGAGTTGCGA TACACAGGGT TACAGCACTG GAACACTATC AGCGCCGGGT 
22741 GGTGCACGCT GGCCAGCACG CTCTTGTCGG AGATCAGATC CGCGTCCAGG TCCTCCGCGT 
22801 TGCTCAGGGC GAACGGAGTC AACTTTGGTA GCTGCCTTCC CAAAAAGGGT GCATGCCCAG 
22861 GCTTTGAGTT GCACTCGCAC CGTAGTGGCA TCAGAAGGTG ACCGTGCCCA GTCTGGGCGT 
22921 TAGGATACAG CGCCTGCATG AAAGCCTTGA TCTGCTTAAA AGCCACCTGA GCCTTTGCGC 
22981 CTTCAGAGAA GAACATGCCG CAAGACTTGC CGGAAAACTG ATTGGCCGGA CAGGCCGCGT 
23041 CATGCACGCA GCACCTTGCG TCGGTGTTGG AGATCTGCAC CACATTTCGG CCCCACCGGT 
23101 TCTTCACGAT CTTGGCCTTG CTAGACTGCT CCTTCAGCGC GCGCTGCCCG TTTTCGCTCG 
23161 TCACATCCAT TTCAATCACG TGCTCCTTAT TTATCATAAT GCTCCCGTGT AGACACTTAA 
23221 GCTCGCCTTC GATCTCAGCG CAGCGGTGCA GCCACAACGC GCAGCCCGTG GGCTCGTGGT 
23281 GCTTGTAGGT TACCTCTGCA AACGACTGCA GGTACGCCTG CAGGAATCGC CCCATCATCG 
23341 TCACAAAGGT CTTGTTGCTG GTGAAGGTCA GCTGCAACCC GCGGTGCTCC TCGTTTAGCC 
23401 AGGTCTTGCA TACGGCCGCC AGAGCTTCCA CTTGGTCAGG CAGTAGCTTG AAGTTTGCCT 
23461 TTAGATCGTT ATCCACGTGG TACTTGTCCA TCAACGCGCG CGCAGCCTCC ATGCCCTTCT 
23521 CCCACGCAGA CACGATCGGC AGGCTCAGCG GGTTTATCAC CGTGCTTTCA CTTTCCGCTT 
23581 CACTGGACTC TTCCTTTTCC TCTTGCATCC GCATACCCCG CGCCACTGGG TCGTCTTCAT 
23641 TCAGCCGCCG CACCGTGCGC TTACCTCCCT TGCCGTGCTT GATTAGCACC GGTGGGTTGC 
23701 TGAAACCCAC CATTTGTAGC GCCACATCTT CTCTTTCTTC CTCGCTGTCC ACGATCACCT 
23761 CTGGGGATGG CGGGCGCTCG GGCTTGGGAG AGGGGCGCTT CTTTTTCTTT TTGGACGCAA 
23821 TGGCCAAATC CGCCGTCGAG GTCGATGGCC GCGGGCTGGG TGTGCGCGGC ACCAGCGCAT 
23881 CTTGTGACGA GTCTTCTTCG TCCTCGGACT CGAGACGCCG CCTCAGCCGC TTTTTTGGGG 
23941 GCGCGCGGGG AGGCGGCGGC GACGGCGACG GGGACGAGAC GTCCTCCATG GTTGGTGGAC 
24001 GTCGCGCCGC ACCGCGTCCG CGCTCGGGGG TGGTTTCGCG CTGCTCCTCT TCCCGACTGG 
24061 CCATTTCCTT CTCCTATAGG CAGAAAAAGA TCATGGAGTC AGTCGAGAAG GAGGACAGCC 
24121 TAACCGCCCC CTTTGAGTTC GCCACCACCG CCTCCACCGA TGCCGCCAAC GCGCCTACCA 
24181 CCTTCCCCGT CGAGGCACCC CCGCTTGAGG AGGAGGAAGT GATTATCGAG CAGGACCCAG 
24241 GTTTTGTAAG CGAAGACGAC GAAGATCGCT CAGTACCAAC AGAGGATAAA AAGCAAGACC 
24301 AGGACGACGC AGAGGCAAAC GAGGAACAAG TCGGGCGGGG GGACCAAAGG CATGGCGACT 
24361 ACCTAGATGT GGGAGACGAC GTGCTGTTGA AGCATCTGCA GCGCCAGTGC GCCATTATCT 
24421 GCGACGCGTT GCAAGAGCGC AGCGATGTGC CCCTCGCCAT AGCGGATGTC AGCCTTGCCT 
24481 ACGAACGCCA CCTGTTCTCA CCGCGCGTAC CCCCCAAACG CCAAGAAAAC GGCACATGCG 
24541 AGCCCAACCC GCGCCTCAAC TTCTACCCCG TATTTGCCGT GCCAGAGGTG CTTGCCACCT 
24601 ATCACATCTT TTTCCAAAAC TGCAAGATAC CCCTATCCTG CCGTGCCAAC CGCAGCCGAG 
24661 CGGACAAGCA GCTGGCCTTG CGGCAGGGCG CTGTCATACC TGATATCGCC TCGCTCGACG 
24721 AAGTGCCAAA AATCTTTGAG GGTCTTGGAC GCGACGAGAA GCGCGCGGCA AACGCTCTGC 
24781 AACAAGAAAA CAGCGAAAAT GAAAGTCACT GTGGAGTGCT GGTGGAACTT GAGGGTGACA 
24841 ACGCGCGCCT AGCCGTGCTG AAACGCAGCA TCGAGGTCAC CCACTTTGCC TACCCGGCAC 
24901 TTAACCTACC CCCCAAGGTT ATGAGCACAG TCATGAGCGA GCTGATCGTG CGCCGTGCAC 
24961 GACCCCTGGA GAGGGATGCA AACTTGCAAG AACAAACCGA GGAGGGCCTA CCCGCAGTTG 
25021 GCGATGAGCA GCTGGCGCGC TGGCTTGAGA CGCGCGAGCC TGCCGACTTG GAGGAGCGAC 
25081 GCAAGCTAAT GATGGCCGCA GTGCTTGTTA CCGTGGAGCT TGAGTGCATG CAGCGGTTCT 
25141 TTGCTGACCC GGAGATGCAG CGCAAGCTAG AGGAAACGTT GCACTACACC TTTCGCCAGG 
25201 GCTACGTGCG CCAGGCCTGC AAAATTTCCA ACGTGGAGCT CTGCAACCTG GTCTCCTACC
25261 TTGGAATTTT GCACGAAAAC CGCCTTGGGC AAAACGTGCT TCATTCCACG CTCAAGGGCG 
25321 AGGCGCGCCG CGACTACGTC CGCGACTGCG TTTACTTATT TCTGTGCTAC ACCTGGCAAA 
25381 CGGCCATGGG CGTGTGGCAG CAGTGCCTGG AGGAGCGCAA CCTGAAGGAG CTGCAGAAGC 
25441 TGCTAAAGCA AAACTTGAAG GACCTATGGA CGGCCTTCAA CGAGCGCTCC GTGGCCGCGC 
25501 ACCTGGCGGA CATTATCTTC CCCGAACGCC TGCTTAAAAC CCTGCAACAG GGTCTGCCAG 
25561 ACTTCACCAG TCAAAGCATG TTGCAAAACT TTAGGAACTT TATCCTAGAG CGTTCAGGAA 
25621 TTCTGCCCGC CACCTGCTGT GCGCTTCCTA GCGACTTTGT GCCCATTAAG TACCGTGAAT 
25681 GCCCTCCGCC GCTTTGGGGT CACTGCTACC TTCTGCAGCT AGCCAACTAC CTTGCCTACC 
25741 ACTCCGACAT CATGGAAGAC GTGAGCGGTG ACGGCCTACT GGAGTGTCAC TGTCGCTGCA 
25801 ACCTATGCAC CCCGCACCGC TCCCTGGTCT GCAATTCACA ACTGCTTAGC GAAAGTCAAA 
25861 TTATCGGTAC CTTTGAGCTG CAGGGTCCCT CGCCTGACGA AAAGTCCGCG GCTCCGGGGT 
25921 TGAAACTCAC TCCGGGGCTG TGGACGTCGG CTTACCTTCG CAAATTTGTA CCTGAGGACT 
25981 ACCACGCCCA CGAGATTAGG TTCTACGAAG ACCAATCCCG CCCGCCAAAT GCGGAGCTTA 
26041 CCGCCTGCGT CATTACCCAG GGCCACATCC TTGGCCAATT GCAAGCCATT AACAAAGCCC 
26101 GCCAAGAGTT TCTGCTACGA AAGGGACGGG GGGTTTACTT GGACCCCCAG TCCGGCGAGG 
26161 AGCTCAACCC AATCCCCCCG CCGCCGCAGC CCTATCAGCA GCCGCGGGCC CTTGCTTCCC 
26221 AGGATGGCAC CCAAAAAGAA GCTGCAGCTG CCGCCGCCGC CACCCACGGA CGAGGAGGAA 
26281 TACTGGGACA GTCAGGCAGA GGAGGTTTTG GACGAGGAGG AGGAGATGAT GGAAGACTGG 
26341 GACAGCCTAG ACGAGGAAGC TTCCGAGGCC GAAGAGGTGT CAGACGAAAC ACCGTCACCC 
26401 TCGGTCGCAT TCCCCTCGCC GGCGCCCCAG AAATCGGCAA CCGTTCCCAG CATTGCTACA 
26461 ACCTCCGCTC CTCAGGCGCC GCCGGCACTG CCCGTTCGCC GACCCAACCG TAGATGGGAC 
26521 ACCACTGGAA CCAGGGCCGG TAAGTCTAAG CAGCCGCCGC CGTTAGCCCA AGAGCAACAA 
26581 CAGCGCCAAG GCTACCGCTC GTGGCGCGTG CACAAGAACG CCATAGTTGC TTGCTTGCAA 
26641 GACTGTGGGG GCAACATCTC CTTCGCCCGC CGCTTTCTTC TCTACCATCA CGGCGTGGCC 
26701 TTCCCCCGTA ACATCCTGCA TTACTACCGT CATCTCTACA GCCCCTACTG CACCGGCGGC 
26761 AGCGGCAGCA ACAGCAGCGG CCACGCAGAA GCAAAGGCGA CCGGATAGCA AGACTCTGAC 
26821 AAAGCCCAAG AAATCCACAG CGGCGGCAGC AGCAGGAGGA GGAGCACTGC GTCTGGCGCC 
26881 CAACGAACCC GTATCGACCC GCGAGCTTAG AAACAGGATT TTTCCCACTC TGTATGCTAT 
26941 ATTTCAACAG AGCAGGGGCC AAGAACAAGA GCTGAAAATA AAAAACAGGT CTCTGCGCTC 
27001 CCTCACCCGC AGCTGCCTGT ATCACAAAAG CGAAGATCAG CTTCGGCGCA CGCTGGAAGA 
27061 CGCGGAGGCT CTCTTCAGCA AATACTGCGC GCTGACTCTT AAGGACTAGT TTCGCGCCCT 
27121 TTCTCAAATT TAAGCGCGAA AACTACGTCA TCTCCAGCGG CCACACCCGG CGCCAGCACC 
27181 TGTCGTCAGC GCCATTATGA GCAAGGAAAT TCCCACGCCC TACATGTGGA GTTACCAGCC 
27241 ACAAATGGGA CTTGCGGCTG GAGCTGCCCA AGACTACTCA ACCCGAATAA ACTACATGAG 
27301 CGCGGGACCC CACATGATAT CCCGGGTCAA CGGAATCCGC GCCCACCGAA ACCGAATTCT 
27361 CCTCGAACAG GCGGCTATTA CCACCACACC TCGTAATAAC CTTAATCCCC GTAGTTGGCC 
27421 CGCTGCCCTG GTGTACCAGG AAAGTCCCGC TCCCACCACT GTGGTACTTC CCAGAGACGC 
27481 CCAGGCCGAA GTTCAGATGA CTAACTCAGG GGCGCAGCTT GCGGGCGGCT TTCGTCACAG 
27541 GGTGCGGTCG CCCGGGCAGG GTATAACTCA CCTGAAAATC AGAGGGCGAG GTATTCAGCT 
27601 CAACGACGAG TCGGTGAGCT CCTCTCTTGG TCTCCGTCCG GACGGGACAT TTCAGATCGG 
27661 CGGCGCTGGC CGCTCTTCAT TTACGCCCCG TCAGGCGATC CTAACTCTGC AGACCTCGTC 
27721 CTCGGAGCCG CGCTCCGGAG GCATTGGAAC TCTACAATTT ATTGAGGAGT TCGTGCCTTC
27781 GGTTTACTTC AACCCCTTTT CTGGACCTCC CGGCCACTAC CCGGACCAGT TTATTCCCAA 
27841 CTTTGACGCG GTAAAAGACT CGGCGGACGG CTACGACTGA ATGACCAGTG GAGAGGCAGA 
27901 GCAACTGCGC CTGACACACC TCGACCACTG CCGCCGCCAC AAGTGCTTTG CCCGCGGCTC 
27961 CGGTGAGTTT TGTTACTTTG AATTGCCCGA AGAGCATATC GAGGGCCCGG CGCACGGCGT 
28021 CCGGCTCACC ACCCAGGTAG AGCTTACACG TAGCCTGATT CGGGAGTTTA CCAAGCGCCC 
28081 CCTGCTAGTG GAGCGGGAGC GGGGTCCCTG TGTTCTGACC GTGGTTTGCA ACTGTCCTAA 
28141 CCCTGGATTA CATCAAGATC TTTGTTGTCA TCTCTGTGCT GAGTATAATA AATACAGAAA 
28201 TTAGAATCTA CTGGGGCTCC TGTCGCCATC CTGTGAACGC CACCGTTTTT ACCCACCCAA 
28261 AGCAGACCAA AGCAAACCTC ACCTCCGGTT TGCACAAGCG GGCCAATAAG TACCTTACCT 
28321 GGTACTTTAA CGGCTCTTCA TTTGTAATTT ACAACAGTTT CCAGCGAGAC GAAGTAAGTT 
28381 TGCCACACAA CCTTCTCGGC TTCAACTACA CCGTCAAGAA AAACACCACC ACCACCCTCC 
28441 TCACCTGCCG GGAACGTACG AGTGCGTCAC CGGTTGCTGC GCCCACACCT ACAGCCTGAG 
28501 CGTAACCAGA CATTACTCCC ATTTTCCCAA AACAGGAGGT GAGCTCAACT CCCGGAACTC 
28561 AGGTCAAAAA AGCATTTTGC GGGGTGCTGG GATTTTTTAA TTAAGTATAT GAGCAATTCA 
28621 AGTAACTCTA CAAGCTTGTC TAATTTTTCT GGAATTGGGG TCGGGGTTAT CCTTACTCTT 
28681 GTAATTCTGT TTATTCTTAT ACTAGCACTT CTGTGCCTTA GGGTTGCCGC CTGCTGCACG 
28741 CACGTTTGTA CCTATTGTCA GCTTTTTAAA CGCTGGGGGC GACATCCAAG ATGAGGTACA 
28801 TGATTTTAGG CTTGCTCGCC CTTGCGGCAG TCTGCAGCGC TGCCAAAAAG GTTGAGTTTA 
28861 AGGAACCAGC TTGCAATGTT ACATTTAAAT CAGAAGCTAA TGAATGCACT ACTCTTATAA 
28921 AATGCACCAC AGAACATGAA AAGCTTATTA TTCGCCACAA AGACAAAATT GGCAAGTATG 
28981 CTGTATATGC TATTTGGCAG CCAGGTGACA CTAACGACTA TAATGTCACA GTCTTCCAAG 
29041 GTGAAAATCG TAAAACTTTT ATGTATAAAT TTCCATTTTA TGAAATGTGC GATATTACCA 
29101 TGTACATGAG CAAACAGTAC AAGTTGTGGC CCCCACAAAA GTGTTTAGAG AACACTGGCA 
29161 CCTTTTGTTC CACCGCTCTG CTTATTACAG CGCTTGCTTT GGTATGTACC TTACTTTATC 
29221 TCAAATACAA AAGCAGACGC AGTTTTATTG ATGAAAAGAA AATGCCTTGA TTTTCCGCTT 
29281 GCTTGTATTC CCCTGGACAA TTTACTCTAT GTGGGATATG CGCCAGGCGG GAAAGATTAT 
29341 ACCCACAACC TTCAAATCAA ACTTTCCTGG ACGTTAGCGC CTGACTTCTG CCAGCGCCTG 
29401 CACTGCAAAT TTGATCAAAC CCAGCTTCAG CTTGCCTGCT CCAGAGATGA CCGGCTCAAC 
29461 CATCGCGCCC ACAACGGACT ATCGCAACAC CACTGCTACC GGACTAAAAT CTGCCCTAAA 
29521 TTTACCCCAA GTTCATGCCT TTGTCAATGA CTGGGCGAGC TTGGGCATGT GGTGGTTTTC 
29581 CATAGCGCTT ATGTTTGTTT GCCTTATTAT TATGTGGCTT ATTTGTTGCC TAAAGCGCAG 
29641 ACGCGCCAGA CCCCCCATCT ATAGGCCTAT CATTGTGCTC AACCCACACA ATGAAAAAAT 
29701 TCATAGATTG GACGGTCTCA AACCATGTTC TCTTCTTTTA CAGTATGATT AAATGAGACA 
29761 TGATTCCTCG AGTCCTTATA TTATTGACCC TTGTTGCGCT TTTCTGTGCG TGCTCTACAT 
29821 TGGCTGCGGT CGCTCACATC GAAGTAGATT GCATCCCACC TTTCACAGTT TACCTGCTTT 
29881 ACGGATTTGT CACCCTTATC CTCATCTGCA GCCTCGTCAC TGTAGTCATC GCCTTCATTC 
29941 AGTTCATTGA CTGGATTTGT GTGCGCATTG CGTACCTTAG GCACCATCCG CAATACAGAG 
30001 ACAGGACTAT AGCTGATCTT CTCAGAATTC TTTAATTATG AAACGGATTG TCACTTTTGT 
30061 TTTGCTGATT TTCTGCGCCC TACCTGTGCT TTGCTCCCAA ACCTCAGCGC CTCCCAAAAG 
30121 ACATATTTCC TGCAGATTCA CTCAAATATG GAACATTCCC AGCTGCTACA ACAAACAGAG 
30181 CGATTTGTCA GAAGCCTGGT TATACGCCAT CATCTCTGTC ATGGTTTTTT GCAGTACCAT 
30241 TTTTGCCCTA GCCATATACC CATACCTTGA CATTGGTTGG AATGCCATAG ATGCCATGAA
30301 CCACCCTACT TTCCCAGCGC CCAATGTCAT ACCACTGCAA CAGGTTATTG CCCCAATCAA
30361 TCAGCCTCGC CCCCCTTCTC CCACCCCCAC TGAGATTAGC TACTTTAATT TGACAGGTGG 
30421 AGATGACTGA ATCTCTAGAT CTAGAATTGG ATGGAATTAA CACCGAACAG CGCCTACTAG 
30481 AAAGGCGCAA GGCGGCGTCC GAGCGAGAAC GCCTAAAACA AGAAGTTGAA GACATGGTTA 
30541 ACCTGCACCA GTGTAAAAGA GGTATCTTTT GTGTGGTCAA GCAGGCCAAA CTTACCTACG 
30601 AAAAAACCAC TACCGGCAAC CGCCTTAGCT ACAAGCTACC CACCCAGCGC CAAAAACTGG 
30661 TGCTTATGGT GGGAGAAAAA CCTATCACCG TCACCCAGCA CTCGGCAGAA ACAGAAGGCT 
30721 GCCTGCACTT CCCCTATCAG GGTCCAGAGG ACCTCTGCAC TCTTATTAAA ACCATGTGTG 
30781 GCATTAGAGA TCTTATTCCA TTCAACTAAC AATAAACACA CAATAAATTA CTTACTTAAA 
30841 ATCAGTCAGC AAATCTTTGT CCAGCTTATT CAGCATCACC TCCTTTCCCT CCTCCCAACT 
30901 CTGGTATTTC AGCAGCCTTT TAGCTGCGAA CTTTCTCCAA AGTCTAAATG GGATGTCAAA 
30961 TTCCTCATGT TCTTGTCCCT CCGCACCCAC TATCTTCATA TTGTTGCAGA TGAAACGCGC 
31021 CAGACCGTCT GAAGACACCT TCAACCCTGT GTACCCATAT GACACGGAAA CCGGCCCTCC 
31081 AACTGTGCCT TTCCTTACCC CTCCCTTTGT GTCGCCAAAT GGGTTCCAAG AAAGTCCCCC 
31141 CGGAGTGCTT TCTTTGCGTC TTTCAGAACC TTTGGTTACC TCACACGGCA TGCTTGCGCT 
31201 AAAAATGGGC AGCGGCCTGT CCCTGGATCA GGCAGGCAAC CTTACATCAA ATACAATCAC 
31261 TGTTTCTCAA CCGCTAAAAA AAACAAAGTC CAATATAACT TTGGAAACAT CCGCGCCCCT 
31321 TACAGTCAGC TCAGGCGCCC TAACCATGGC CACAACTTCG CCTTTGGTGG TCTCTGACAA 
31381 CACTCTTACC ATGCAATCAC AAGCACCGCT AACCGTGCAA GACTCAAAAC TTAGCATTGC 
31441 TACCAAAGAG CCACTTACAG TGTTAGATGG AAAACTGGCC CTGCAGACAT CAGCCCCCCT 
31501 CTCTGCCACT GATAACAACG CCCTCACTAT CACTGCCTCA CCTCCTCTTA CTACTGCAAA 
31561 TGGTAGTCTG GCTGTTACCA TGGAAAACCC ACTTTACAAC AACAATGGAA AACTTGGGCT 
31621 CAAAATTGGC GGTCCTTTGC AAGTGGCCAC CGACTCACAT GCACTAACAC TAGGTACTGG 
31681 TCAGGGGGTT GCAGTTCATA ACAATTTGCT ACATACAAAA GTTACAGGCG CAATAGGGTT 
31741 TGATACATCT GGCAACATGG AACTTAAAAC TGGAGATGGC CTCTATGTGG ATAGCGCCGG 
31801 TCCTAACCAA AAACTACATA TTAATCTAAA TACCACAAAA GGCCTTGCTT TTGACAACAC 
31861 CGCAATAACA ATTAACGCTG GAAAAGGGTT GGAATTTGAA ACAGACTCCT CAAACGGAAA 
31921 TCCCATAAAA ACAAAAATTG GATCAGGCAT ACAATATAAT ACCAATGGAG CTATGGTTGC 
31981 AAAACTTGGA ACAGGCCTCA GTTTTGACAG CTCCGGAGCC ATAACAATGG GCAGCATAAA 
32041 CAATGACAGA CTTACTCTTT GGACAACACC AGACCCATCC CCAAATTGCA GAATTGCTTC 
32101 AGATAAAGAC TGCAAGCTAA CTCTGGCGCT AACAAAATGT GGCAGTCAAA TTTTGGGCAC 
32161 TGTTTCAGCT TTGGCAGTAT CAGGTAATAT GGCCTCCATC AATGGAACTC TAAGCAGTGT 
32221 AAACTTGGTT CTTAGATTTG ATGACAACGG AGTGCTTATG TCAAATTCAT CACTGGACAA 
32281 ACAGTATTGG AACTTTAGAA ACGGGGACTC CACTAACGGT CAACCATACA CTTATGCTGT 
32341 TGGGTTTATG CCAAACCTAA AAGCTTACCC AAAAACTCAA AGTAAAACTG CAAAAAGTAA 
32401 TATTGTTAGC CAGGTGTATC TTAATGGTGA CAAGTCTAAA CCATTGCATT TTACTATTAC 
32461 GCTAAATGGA ACAGATGAAA CCAACCAAGT AAGCAAATAC TCAATATCAT TCAGTTGGTC 
32521 CTGGAACAGT GGACAATACA CTAATGACAA ATTTGCCACC AATTCCTATA CCTTCTCCTA 
32581 CATTGCCCAG GAATAAAGAA TCGTGAACCT GTTGCATGTT ATGTTTCAAC GTGTTTATTT 
32641 TTCAATTGCA GAAAATTTCA AGTCATTTTT CATTCAGTAG TATAGCCCCA CCACCACATA 
32701 GCTTATACTA ATCACCGTAC CTTAATCAAA CTCACAGAAC CCTAGTATTC AACCTGCCAC 
32761 CTCCCTCCCA ACACACAGAG TACACAGTCC TTTCTCCCCG GCTGGCCTTA AACAGCATCA
32821 TATCATGGGT AACAGACATA TTCTTAGGTG TTATATTCCA CACGGTCTCC TGTCGAGCCA 
32881 AACGCTCATC AGTGATGTTA ATAAACTCCC CGGGCAGCTC GCTTAAGTTC ATGTCGCTGT 
32941 CCAGCTGCTG AGCCACAGGC TGCTGTCCAA CTTGCGGTTG CTCAACGGGC GGCGAAGGAG 
33001 AAGTCCACGC CTACATGGGG GTAGAGTCAT AATCGTGCAT CAGGATAGGG CGGTGGTGCT 
33061 GCAGCAGCGC GCGAATAAAC TGCTGCCGCC GCCGCTCCGT CCTGCAGGAA TACAACATGG 
33121 CAGTGGTCTC CTCAGCGATG ATTCGCACCG CCCGCAGCAT AAGGCGCCTT GTCCTCCGGG 
33181 CACAGCAGCG CACCCTGATC TCACTTAAGT CAGCACAGTA ACTGCAGCAC AGTACCACAA 
33241 TATTGTTTAA AATCCCACAG TGCAAGGCGC TGTATCCAAA GCTCATGGCG GGGACCACAG 
33301 AACCCACGTG GCCATCATAC CACAAGCGCA GGTAGATTAA GTGGCGACCC CTCATAAACA 
33361 CGCTGGACAT AAACATTACC TCTTTTGGCA TGTTGTAATT CACCACCTCC CGGTACCATA 
33421 TAAACCTCTG ATTAAACATG GCGCCATCCA CCACCATCCT AAACCAGCTG GCCAAAACCT 
33481 GCCCGCCGGC TATGCACTGC AGGGAACCGG GACTGGAACA ATGACAGTGG AGAGCCCAGG 
33541 ACTCGTAACC ATGGATCATC ATGCTCGTCA TGATATCAAT GTTGGCACAA CACAGGCACA 
33601 CGTGCATACA CTTCCTCAGG ATTACAAGCT CCTCCCGCGT CAGAACCATA TCCCAGGGAA 
33661 CAACCCATTC CTGAATCAGC GTAAATCCCA CACTGCAGGG AAGACCTCGC ACGTAACTCA 
33721 CGTTGTGCAT TGTCAAAGTG TTACATTCGG GCAGCAGCGG ATGATCCTCC AGTATGGTAG
33781 CGCGTGTCTC TGTCTCAAAA GGAGGTAGGC GATCCCTACT GTACGGAGTG CGCCGAGACA 
33841 ACCGAGATCG TGTTGGTCGT AGTGTCATGC CAAATGGAAC GCCGGACGTA GTCATATTTC 
33901 CTGAAGCAAA ACCAGGTGCG GGCGTGACAA ACAGATCTGC GTCTCCGGTC TCGTCGCTTA 
33961 GCTCGCTCTG TGTAGTAGTT GTAGTATATC CACTCTCTCA AAGCATCCAG GCGCCCCCTG 
34021 GCTTCGGGTT CTATGTAAAC TCCTTCATGC GCCGCTGCCC TGATAACATC CACCACCGCA 
34081 GAATAAGCCA CACCCAGCCA ACCTACACAT TCGTTCTGCG AGTCACACAC GGGAGGAGCG 
34141 GGAAGAGCTG GAAGAACCAT GTTTTTTTTT TTTATTCCAA AAGATTATCC AAAACCTCAA 
34201 AATGAAGATC TATTAAGTGA ACGCGCTCCC CTCCGGTGGC GTGGTCAAAC TCTACAGCCA 
34261 AAGAACAGAT AATGGCATTT GTAAGATGTT GCACAATGGC TTCCAAAAGG CAAACTGCCC 
34321 TCACGTCCAA GTGGACGTAA AGGCTAAACC CTTCAGGGTG AATCTCCTCT ATAAACATTC 
34381 CAGCACCTTC AACCATGCCC AAATAATTTT CATCTCGCCA CCTTATCAAT ATGTCTCTAA 
34441 GCAAATCCCG AATATTAAGT CCGGCCATTG TAAAAATCTG CTCCAGAGCG CCCTCCACCT 
34501 TCAGCCTCAA GCAGCGAATC ATGATTGCAA AAATTCAGGT TCCTCACAGA CCTGTATAAG 
34561 ATTCAAAAGC GGAACATTAA CAAAAATACC GCGATCCCGT AGGTCCCTTC GCAGGGCCAG
34621 CTGAACATAA TCGTGCAGGT CTGCACGGAC CAGCGCGGCC ACTTCCCCGC CAGGAACCAT 
34681 GACAAAAGAA CCCACACTGA TTATGACACG CATACTCGGA GCTATGCTAA CCAGCGTAGC 
34741 CCCGATGTAA GCTTGTTGCA TGGGCGGCGA TATAAAATGC AAGGTACTGC TCAAAAAATC 
34801 AGGCAAAGCC TCGCGCAAAA AAGCAAGCAC ATCGTAGTCA TGCTCATGCA GATAAAGGCA 
34861 GGTAAGTTCC GGAACCACCA CAGAAAAAGA CACCATTTTT CTCTCAAACA TGTCTGCGGG 
34921 TTCCTGCATA AACACAAAAT AAAATAACAA AAAAAAAAAA ACATTTAAAC ATTAGAAGCC 
34981 TGTNTTACAA CAGGAAAAAC AACCCTTATA AGCATAAGAC GGACTACGGC CATGCCGGCG 
35041 TGACCGTAAA AAAACTGGTC ACCGTGATTA AAAAGCACCA CCGACAGTTC CTCGGTCATG 
35101 TCCGGAGTCA TAATGTAAGA CTCGGTAAAC ACATCAGGTT GGTTAACATC GGTCAGTGCT 
35161 AAAAAGCGAC CGAAATAGCC CGGGGGAATA CATACCCGCA GGCGTAGAGA CAACATTACA 
35221 GCCCCCATAG GAGGTATAAC AAAATTAATA GGAGAGAAAA ACACATAAAC ACCTGAAAAA 
35281 CCCTCCTGCC TAGGCAAAAT AGCACCCTCC CGCTCCAGAA CAACATACAG CGCTTCCACA
35341 GCGGCAGCCA TAACAGTCAG CCTTACCAGT AAAAAAACCT ATTAAAAAAC ACCACTCGAC 
35401 ACGGCACCAG CTCAATCAGT CACAGTGTAA AAAGGGCCAA GTACAGAGCG AGTATATATA 
35461 GGACTAAAAA ATGACGTAAC GGTTAAAGTC CACAAAAACC ACCCAGAAAA CCGCACGCGA 
35521 ACCTACGCCC AGAAACGAAA GCCAAAAAAC CCACAACTTC CTCAAATCTT CACTTCCGTT
35581 TTCCCACGAT ACGTCACTTC CCATTTTAAA AAAAAACTAC AATTCCCAAT ACATGCAAGT 
35641 TACTCCGCCC TAAAACCTAC GTCACCCGCC CCGTTCCCAC GCCCCGCGCC ACGTCACAAA 
35701 CTCCACCCCC TCATTATCAT ATTGGCTTCA ATCCAAAATA AGGTATATTA TTGATGATG 
1 CATCATCAAT AATATACCTT ATTTTGGATT GAAGCCAATA TGATAATGAG GGGGTGGAGT
61 TTGTGACGTG GCGCGGGGCG TGGGAACGGG GCGGGTGACG TAGTAGTGTG GCGGAAGTGT 
121 GATGTTGCAA GTGTGGCGGA ACACATGTAA GCGACGGATG TGGCAAAAGT GACGTTTTTG 
181 GTGTGCGCCG GTGTACACAG GAAGTGACAA TTTTCGCGCG GTTTTAGGCG GATGTTGTAG 
241 TAAATTTGGG CGTAACCGAG TAAGATTTGG CCATTTTCGC GGGAAAACTG AATAAGAGGA
301 AGTGAAATCT GAATAATTTT GTGTTACTCA TAGCGCGTAA TATTTGTCTA GGGCCGCGGG 
361 GACTTTGACC GTTTACGTGG AGACTCGCCC AGGTGTTTTT CTCAGGTGTT TTCCGCGTTC 
421 CGGGTCAAAG TTGGCGTTTT ATTATTATAG TCAGCTGACG TGTAGTGTAT TTATACCCGG 
481 TGAGTTCCTC AAGAGGCCAC TCTTGAGTGC CAGCGAGTAG AGTTTTCTCC TCCGAGCCGC 
541 TCCGACACCG GGACTGAAAA TGAGACATAT TATCTGCCAC GGAGGTGTTA TTACCGAAGA 
601 AATGGCCGCC AGTCTTTTGG ACCAGCTGAT CGAAGAGGTA CTGGCTGATA ATCTTCCACC 
661 TCCTAGCCAT TTTGAACCAC CTACCCTTCA CGAACTGTAT GATTTAGACG TGACGGCCCC 
721 CGAAGATCCC AACGAGGAGG CGGTTTCGCA GATTTTTCCC GACTCTGTAA TGTTGGCGGT 
781 GCAGGAAGGG ATTGACTTAC TCACTTTTCC GCCGGCGCCC GGTTCTCCGG AGCCGCCTCA 
841 CCTTTCCCGG CAGCCCGAGC AGCCGGAGCA GAGAGCCTTG GGTCCGGTTT CTATGCCAAA 
901 CCTTGTACCG GAGGTGATCG ATCTTACCTG CCACGAGGCT GGCTTTCCAC CCAGTGACGA 
961 CGAGGATGAA GAGGGTGAGG AGTTTGTGTT AGATTATGTG GAGCACCCCG GGCACGGTTG 
1021 CAGGTCTTGT CATTATCACC GGAGGAATAC GGGGGACCCA GATATTATGT GTTCGCTTTG 
1081 CTATATGAGG ACCTGTGGCA TGTTTGTCTA CAGTAAGTGA AAATTATGGG CAGTGGGTGA 
1141 TAGAGTGGTG GGTTTGGTGT GGTAATTTTT TTTTTAATTT TTACAGTTTT GTGGTTTAAA 
1201 GAATTTTGTA TTGTGATTTT TTTAAAAGGT CCTGTGTCTG AACCTGAGCC TGAGCCCGAG 
1261 CCAGAACCGG AGCCTGCAAG ACCTACCCGC CGTCCTAAAA TGGCGCCTGC TATCCTGAGA 
1321 CGCCCGACAT CACCTGTGTC TAGAGAATGC AATAGTAGTA CGGATAGCTG TGACTCCGGT 
1381 CCTTCTAACA CACCTCCTGA GATACACCCG GTGGTCCCGC TGTGCCCCAT TAAACCAGTT 
1441 GCCGTGAGAG TTGGTGGGCG TCGCCAGGCT GTGGAATGTA TCGAGGACTT GCTTAACGAG 
1501 CCTGGGCAAC CTTTGGACTT GAGCTGTAAA CGCCCCAGGC CATAAGGTGT AAACCTGTGA 
1561 TTGCGTGTGT GGTTAACGCC TTTGTTTGCT GAATGAGTTG ATGTAAGTTT AATAAAGGGT 
1621 GAGATAATGT TTAACTTGCA TGGCGTGTTA AATGGGGCGG GGCTTAAAGG GTATATAATG 
1681 CGCCGTGGGC TAATCTTGGT TACATCTGAC CTCATGGAGG CTTGGGAGTG TTTGGAAGAT 
1741 TTTTCTGCTG TGCGTAACTT GCTGGAACAG AGCTCTAACA GTACCTCTTG GTTTTGGAGG 
1801 TTTCTGTGGG GCTCATCCCA GGCAAAGTTA GTCTGCAGAA TTAAGGAGGA TTACAAGTGG 
1861 GAATTTGAAG AGCTTTTGAA ATCCTGTGGT GAGCTGTTTG ATTCTTTGAA TCTGGGTCAC 
1921 CAGGCGCTTT TCCAAGAGAA GGTCATCAAG ACTTTGGATT TTTCCACACC GGGGCGCGCT 
1981 GCGGCTGCTG TTGCTTTTTT GAGTTTTATA AAGGATAAAT GGAGCGAAGA AACCCATCTG 
2041 AGCGGGGGGT ACCTGCTGGA TTTTCTGGCC ATGCATCTGT GGAGAGCGGT TGTGAGACAC 
2101 AAGAATCGCC TGCTACTGTT GTCTTCCGTC CGCCCGGCGA TAATACCGAC GGAGGAGCAG 
2161 CAGCAGCAGC AGGAGGAAGC CAGGCGGCGG CGGCAGGAGC AGAGCCCATG GAACCCGAGA 
2221 GCCGGCCTGG ACCCTCGGGA ATGAATGTTG TACAGGTGGC TGAACTGTAT CCAGAACTGA 
2281 GACGCATTTT GACAATTACA GAGGATGGGC AGGGGCTAAA GGGGGTAAAG AGGGAGCGGG 
2341 GGGCTTGTGA GGCTACAGAG GAGGCTAGGA ATCTAGCTTT TAGCTTAATG ACCAGACACC 
2401 GTCCTGAGTG TATTACTTTT CAACAGATCA AGGATAATTG CGCTAATGAG CTTGATCTGC 
2461 TGGCGCAGAA GTATTCCATA GAGCAGCTGA CCACTTACTG GCTGCAGCCA GGGGATGATT 
2521 TTGAGGAGGC TATTAGGGTA TATGCAAAGG TGGCACTTAG GCCAGATTGC AAGTACAAGA
2581 TCAGCAAACT TGTAAATATC AGGAATTGTT GCTACATTTC TGGGAACGGG GCCGAGGTGG 
2641 AGATAGATAC GGAGGATAGG GTGGCCTTTA GATGTAGCAT GATAAATATG TGGCCGGGGG 
2701 TGCTTGGCAT GGACGGGGTG GTTATTATGA ATGTAAGGTT TACTGGCCCC AATTTTAGCG 
2761 GTACGGTTTT CCTGGCCAAT ACCAACCTTA TCCTACACGG TGTAAGCTTC TATGGGTTTA 
2821 ACAATACCTG TGTGGAAGCC TGGACCGATG TAAGGGTTCG GGGCTGTGCC TTTTACTGCT 
2881 GCTGGAAGGG GGTGGTGTGT CGCCCCAAAA GCAGGGCTTC AATTAAGAAA TGCCTCTTTG 
2941 AAAGGTGTAC CTTGGGTATC CTGTCTGAGG GTAACTCCAG GGTGCGCCAC AATGTGGCCT 
3001 CCGACTGTGG TTGCTTCATG CTAGTGAAAA GCGTGGCTGT GATTAAGCAT AACATGGTAT 
3061 GTGGCAACTG CGAGGACAGG GCCTCTCAGA TGCTGACCTG CTCGGACGGC AACTGTCACC 
3121 TGCTGAAGAC CATTCACGTA GCCAGCCACT CTCGCAAGGC CTGGCCAGTG TTTGAGCATA 
3181 ACATACTGAC CCGCTGTTCC TTGCATTTGG GTAACAGGAG GGGGGTGTTC CTACCTTACC
3241 AATGCAATTT GAGTCACACT AAGATATTGC TTGAGCCCGA GAGCATGTCC AAGGTGAACC 
3301 TGAACGGGGT GTTTGACATG ACCATGAAGA TCTGGAAGGT GCTGAGGTAC GATGAGACCC
3361 GCACCAGGTG CAGACCCTGC GAGTGTGGCG GTAAACATAT TAGGAACCAG CCTGTGATGC 
3421 TGGATGTGAC CGAGGAGCTG AGGCCCGATC ACTTGGTGCT GGCCTGCACC CGCGCTGAGT 
3481 TTGGCTCTAG CGATGAAGAT ACAGATTGAG GTACTGAAAT GTGTGGGCGT GGCTTAAGGG 
3541 TGGGAAAGAA TATATAAGGT GGGGGTCTTA TGTAGTTTTG TATCTGTTTT GCAGCAGCCG 
3601 CCGCCGCCAT GAGCACCAAC TCGTTTGATG GAAGCATTGT GAGCTCATAT TTGACAACGC 
3661 GCATGCCCCC ATGGGCCGGG GTGCGTCAGA ATGTGATGGG CTCCAGCATT GATGGTCGCC 
3721 CCGTCCTGCC CGCAAACTCT ACTACCTTGA CCTACGAGAC CGTGTCTGGA ACGCCGTTGG 
3781 AGACTGCAGC CTCCGCCGCC GCTTCAGCCG CTGCAGCCAC CGCCCGCGGG ATTGTGACTG 
3841 ACTTTGCTTT CCTGAGCCCG CTTGCAAGCA GTGCAGCTTC CCGTTCATCC GCCCGCGATG 
3901 ACAAGTTGAC GGCTCTTTTG GCACAATTGG ATTCTTTGAC CCGGGAACTT AATGTCGTTT 
3961 CTCAGCAGCT GTTGGATCTG CGCCAGCAGG TTTCTGCCCT GAAGGCTTCC TCCCCTCCCA 
4021 ATGCGGTTTA AAACATAAAT AAAAAACCAG ACTCTGTTTG GATTTGGATC AAGCAAGTGT 
4081 CTTGCTGTCT TTATTTAGGG GTTTTGCGCG CGCGGTAGGC CCGGGACCAG CGGTCTCGGT 
4141 CGTTGAGGGT CCTGTGTATT TTTTCCAGGA CGTGGTAAAG GTGACTCTGG ATGTTCAGAT 
4201 ACATGGGCAT AAGCCCGTCT CTGGGGTGGA GGTAGCACCA CTGCAGAGCT TCATGCTGCG 
4261 GGGTGGTGTT GTAGATGATC CAGTCGTAGC AGGAGCGCTG GGCGTGGTGC CTAAAAATGT 
4321 CTTTCAGTAG CAAGCTGATT GCCAGGGGCA GGCCCTTGGT GTAAGTGTTT ACAAAGCGGT 
4381 TAAGCTGGGA TGGGTGCATA CGTGGGGATA TGAGATGCAT CTTGGACTGT ATTTTTAGGT 
4441 TGGCTATGTT CCCAGCCATA TCCCTCCGGG GATTCATGTT GTGCAGAACC ACCAGCACAG 
4501 TGTATCCGGT GCACTTGGGA AATTTGTCAT GTAGCTTAGA AGGAAATGCG TGGAAGAACT 
4561 TGGAGACGCC CTTGTGACCT CCAAGATTTT CCATGCATTC GTCCATAATG ATGGCAATGG 
4621 GCCCACGGGC GGCGGCCTGG GCGAAGATAT TTCTGGGATC ACTAACGTCA TAGTTGTGTT 
4681 CCAGGATGAG ATCGTCATAG GCCATTTTTA CAAAGCGCGG GCGGAGGGTG CCAGACTGCG 
4741 GTATAATGGT TCCATCCGGC CCAGGGGCGT AGTTACCCTC ACAGATTTGC ATTTCCCACG 
4801 CTTTGAGTTC AGATGGGGGG ATCATGTCTA CCTGCGGGGC GATGAAGAAA ACGGTTTCCG 
4861 GGGTAGGGGA GATCAGCTGG GAAGAAAGCA GGTTCCTGAG CAGCTGCGAC TTACCGCAGC 
4921 CGGTGGGCCC GTAAATCACA CCTATTACCG GGTGCAACTG GTAGTTAAGA GAGCTGCAGC 
4981 TGCCGTCATC CCTGAGCAGG GGGGCCACTT CGTTAAGCAT GTCCCTGACT CGCATGTTTT 
5041 CCCTGACCAA ATCCGCCAGA AGGCGCTCGC CGCCCAGCGA TAGCAGTTCT TGCAAGGAAG 
5101 CAAAGTTTTT CAACGGTTTG AGACCGTCCG CCGTAGGCAT GCTTTTGAGC GTTTGACCAA 
5161 GCAGTTCCAG GCGGTCCCAC AGCTCGGTCA CCTGCTCTAC GGCATCTCGA TCCAGCATAT 
5221 CTCCTCGTTT CGCGGGTTGG GGCGGCTTTC GCTGTACGGC AGTAGTCGGT GCTCGTCCAG 
5281 ACGGGCCAGG GTCATGTCTT TCCACGGGCG CAGGGTCCTC GTCAGCGTAG TCTGGGTCAC 
5341 GGTGAAGGGG TGCGCTCCGG GCTGCGCGCT GGCCAGGGTG CGCTTGAGGC TGGTCCTGCT 
5401 GGTGCTGAAG CGCTGCCGGT CTTCGCCCTG CGCGTCGGCC AGGTAGCATT TGACCATGGT 
5461 GTCATAGTCC AGCCCCTCCG CGGCGTGGCC CTTGGCGCGC AGCTTGCCCT TGGAGGAGGC 
5521 GCCGCACGAG GGGCAGTGCA GACTTTTGAG GGCGTAGAGC TTGGGCGCGA GAAATACCGA 
5581 TTCCGGGGAG TAGGCATCCG CGCCGCAGGC CCCGCAGACG GTCTCGCATT CCACGAGCCA 
5641 GGTGAGCTCT GGCCGTTCGG GGTCAAAAAC CAGGTTTCCC CCATGCTTTT TGATGCGTTT 
5701 CTTACCTCTG GTTTCCATGA GCCGGTGTCC ACGCTCGGTG ACGAAAAGGC TGTCCGTGTC
5761 CCCGTATACA GACTTGAGAG GCCTGTCCTC GAGCGGTGTT CCGCGGTCCT CCTCGTATAG 
5821 AAACTCGGAC CACTCTGAGA CAAAGGCTCG CGTCCAGGCC AGCACGAAGG AGGCTAAGTG 
5881 GGAGGGGTAG CGGTCGTTGT CCACTAGGGG GTCCACTCGC TCCAGGGTGT GAAGACACAT 
5941 GTCGCCCTCT TCGGCATCAA GGAAGGTGAT TGGTTTGTAG GTGTAGGCCA CGTGACCGGG 
6001 TGTTCCTGAA GGGGGGCTAT AAAAGGGGGT GGGGGCGCGT TCGTCCTCAC TCTCTTCCGC 
6061 ATCGCTGTCT GCGAGGGCCA GCTGTTGGGG TGAGTACTCC CTCTGAAAAG CGGGCATGAC 
6121 TTCTGCGCTA AGATTGTCAG TTTCCAAAAA CGAGGAGGAT TTGATATTCA CCTGGCCCGC 
6181 GGTGATGCCT TTGAGGGTGG CCGCATCCAT CTGGTCAGAA AAGACAATCT TTTTGTTGTC 
6241 AAGCTTGGTG GCAAACGACC CGTAGAGGGC GTTGGACAGC AACTTGGCGA TGGAGCGCAG 
6301 GGTTTGGTTT TTGTCGCGAT CGGCGCGCTC CTTGGCCGCG ATGTTTAGCT GCACGTATTC 
6361 GCGCGCAACG CACCGCCATT CGGGAAAGAC GGTGGTGCGC TCGTCGGGCA CCAGGTGCAC 
6421 GCGCCAACCG CGGTTGTGCA GGGTGACAAG GTCAACGCTG GTGGCTACCT CTCCGCGTAG 
6481 GCGCTCGTTG GTCCAGCAGA GGCGGCCGCC CTTGCGCGAG CAGAATGGCG GTAGGGGGTC 
6541 TAGCTGCGTC TCGTCCGGGG GGTCTGCGTC CACGGTAAAG ACCCCGGGCA GCAGGCGCGC 
6601 GTCGAAGTAG TCTATCTTGC ATCCTTGCAA GTCTAGCGCC TGCTGCCATG CGCGGGCGGC
6661 AAGCGCGCGC TCGTATGGGT TGAGTGGGGG ACCCCATGGC ATGGGGTGGG TGAGCGCGGA 
6721 GGCGTACATG CCGCAAATGT CGTAAACGTA GAGGGGCTCT CTGAGTATTC CAAGATATGT 
6781 AGGGTAGCAT CTTCCACCGC GGATGCTGGC GCGCACGTAA TCGTATAGTT CGTGCGAGGG 
6841 AGCGAGGAGG TCGGGACCGA GGTTGCTACG GGCGGGCTGC TCTGCTCGGA AGACTATCTG 
6901 CCTGAAGATG GCATGTGAGT TGGATGATAT GGTTGGACGC TGGAAGACGT TGAAGCTGGC 
6961 GTCTGTGAGA CCTACCGCGT CACGCACGAA GGAGGCGTAG GAGTCGCGCA GCTTGTTGAC 
7021 CAGCTCGGCG GTGACCTGCA CGTCTAGGGC GCAGTAGTCC AGGGTTTCCT TGATGATGTC 
7081 ATACTTATCC TGTCCCTTTT TTTTCCACAG CTCGCGGTTG AGGACAAACT CTTCGCGGTC 
7141 TTTCCAGTAC TCTTGGATCG GAAACCCGTC GGCCTCCGAA CGGTAAGAGC CTAGCATGTA 
7201 GAACTGGTTG ACGGCCTGGT AGGCGCAGCA TCCCTTTTCT ACGGGTAGCG CGTATGCCTG 
7261 CGCGGCCTTC CGGAGCGAGG TGTGGGTGAG CGCAAAGGTG TCCCTGACCA TGACTTTGAG 
7321 GTACTGGTAT TTGAAGTCAG TGTCGTCGCA TCCGCCCTGC TCCCAGAGCA AAAAGTCCGT 
7381 GCGCTTTTTG GAACGCGGAT TTGGCAGGGC GAAGGTGACA TCGTTGAAGA GTATCTTTCC 
7441 CGCGCGAGGC ATAAAGTTGC GTGTGATGCG GAAGGGTCCC GGCACCTCGG AACGGTTGTT 
7501 AATTACCTGG GCGGCGAGCA CGATCTCGTC AAAGCCGTTG ATGTTGTGGC CCACAATGTA 
7561 AAGTTCCAAG AAGCGCGGGA TGCCCTTGAT GGAAGGCAAT TTTTTAAGTT CCTCGTAGGT 
7621 GAGCTCTTCA GGGGAGCTGA GCCCGTGCTC TGAAAGGGCC CAGTCTGCAA GATGAGGGTT 
7681 GGAAGCGACG AATGAGCTCC ACAGGTCACG GGCCATTAGC ATTTGCAGGT GGTCGCGAAA 
7741 GGTCCTAAAC TGGCGACCTA TGGCCATTTT TTCTGGGGTG ATGCAGTAGA AGGTAAGCGG 
7801 GTCTTGTTCC CAGCGGTCCC ATCCAAGGTT CGCGGCTAGG TCTCGCGCGG CAGTCACTAG 
7861 AGGCTCATCT CCGCCGAACT TCATGACCAG CATGAAGGGC ACGAGCTGCT TCCCAAAGGC 
7921 CCCCATCCAA GTATAGGTCT CTACATCGTA GGTGACAAAG AGACGCTCGG TGCGAGGATG 
7981 CGAGCCGATC GGGAAGAACT GGATCTCCCG CCACCAATTG GAGGAGTGGC TATTGATGTG 
8041 GTGAAAGTAG AAGTCCCTGC GACGGGCCGA ACACTCGTGC TGGCTTTTGT AAAAACGTGC 
8101 GCAGTACTGG CAGCGGTGCA CGGGCTGTAC ATCCTGCACG AGGTTGACCT GACGACCGCG 
8161 CACAAGGAAG CAGAGTGGGA ATTTGAGCCC CTCGCCTGGC GGGTTTGGCT GGTGGTCTTC 
8221 TACTTCGGCT GCTTGTCCTT GACCGTCTGG CTGCTCGAGG GGAGTTACGG TGGATCGGAC 
8281 CACCACGCCG CGCGAGCCCA AAGTCCAGAT GTCCGCGCGC GGCGGTCGGA GCTTGATGAC 
8341 AACATCGCGC AGATGGGAGC TGTCCATGGT CTGGAGCTCC CGCGGCGTCA GGTCAGGCGG 
8401 GAGCTCCTGC AGGTTTACCT CGCATAGACG GGTCAGGGCG CGGGCTAGAT CCAGGTGATA 
8461 CCTAATTTCC AGGGGCTGGT TGGTGGCGGC GTCGATGGCT TGCAAGAGGC CGCATCCCCG 
8521 CGGCGCGACT ACGGTACCGC GCGGCGGGCG GTGGGCCGCG GGGGTGTCCT TGGATGATGC 
8581 ATCTAAAAGC GGTGACGCGG GCGAGCCCCC GGAGGTAGGG GGGGCTCCGG ACCCGCCGGG 
8641 AGAGGGGGCA GGGGCACGTC GGCGCCGCGC GCGGGCAGGA GCTGGTGCTG CGCGCGTAGG 
8701 TTGCTGGCGA ACGCGACGAC GCGGCGGTTG ATCTCCTGAA TCTGGCGCCT CTGCGTGAAG 
8761 ACGACGGGCC CGGTGAGCTT GAGCCTGAAA GAGAGTTGGA CAGAATCAAT TTCGGTGTCG 
8821 TTGACGGCGG CCTGGCGCAA AATCTCCTGC ACGTCTCCTG AGTTGTCTTG ATAGGCGATC 
8881 TCGGCCATGA ACTGCTCGAT CTCTTCCTCC TGGAGATCTC CGCGTCCGGC TCGCTCCACG 
8941 GTGGCGGCGA GGTCGTTGGA AATGCGGGCC ATGAGCTGCG AGAAGGCGTT GAGGCCTCCC 
9001 TCGTTCCAGA CGCGGCTGTA GACCACGCCC CCTTCGGCAT CGCGGGCGCG CATGACCACC 
9061 TGCGCGAGAT TGAGCTCCAC GTGCCGGGCG AAGACGGCGT AGTTTCGCAG GCGCTGAAAG 
9121 AGGTAGTTGA GGGTGGTGGC GGTGTGTTCT GCCACGAAGA AGTACATAAC CCAGCGTCGC 
9181 AACGTGGATT CGTTGATATC CCCCAAGGCC TCAAGGCGCT CCATGGCCTC GTAGAAGTCC 
9241 ACGGCGAAGT TGAAAAACTG GGAGTTGCGC GCCGACACGG TTAACTCCTC CTCCAGAAGA 
9301 CGGATGAGCT CGGCGACAGT GTCGCGCACC TCGCGCTCAA AGGCTACAGG GGCCTCTTCT 
9361 TCTTCTTCAA TCTCCTCTTC CATAAGGGCC TCCCCTTCTT CTTCTTCTGG CGGCGGTGGG
9421 GGAGGGGGGA CACGGCGGCG ACGACGGCGC ACCGGGAGGC GGTCGACAAA GCGCTCGATC 
9481 ATCTCCCCGC GGCGACGGCG CATGGTCTCG GTGACGGCGC GGCCGTTCTC GCGGGGGCGC 
9541 AGTTGGAAGA CGCCGCCCGT CATGTCCCGG TTATGGGTTG GCGGGGGGCT GCCATGCGGC 
9601 AGGGATACGG CGCTAACGAT GCATCTCAAC AATTGTTGTG TAGGTACTCC GCCGCCGAGG 
9661 GACCTGAGCG AGTCCGCATC GACCGGATCG GAAAACCTCT CGAGAAAGGC GTCTAACCAG 
9721 TCACAGTCGC AAGGTAGGCT GAGCACCGTG GCGGGCGGCA GCGGGCGGCG GTCGGGGTTG 
9781 TTTCTGGCGG AGGTGCTGCT GATGATGTAA TTAAAGTAGG CGGTCTTGAG ACGGCGGATG 
9841 GTCGACAGAA GCACCATGTC CTTGGGTCCG GCCTGCTGAA TGCGCAGGCG GTCGGCCATG 
9901 CCCCAGGCTT CGTTTTGACA TCGGCGCAGG TCTTTGTAGT AGTCTTGCAT GAGCCTTTCT
9961 ACCGGCACTT CTTCTTCTCC TTCCTCTTGT CCTGCATCTC TTGCATCTAT CGCTGCGGCG 
10021 GCGGCGGAGT TTGGCCGTAG GTGGCGCCCT CTTCCTCCCA TGCGTGTGAC CCCGAAGCCC 
10081 CTCATCGGCT GAAGCAGGGC TAGGTCGGCG ACAACGCGCT CGGCTAATAT GGCCTGCTGC 
10141 ACCTGCGTGA GGGTAGACTG GAAGTCATCC ATGTCCACAA AGCGGTGGTA TGCGCCCGTG 
10201 TTGATGGTGT AAGTGCAGTT GGCCATAACG GACCAGTTAA CGGTCTGGTG ACCCGGCTGC 
10261 GAGAGCTCGG TGTACCTGAG ACGCGAGTAA GCCCTCGAGT CAAATACGTA GTCGTTGCAA 
10321 GTCCGCACCA GGTACTGGTA TCCCACCAAA AAGTGCGGCG GCGGCTGGCG GTAGAGGGGC 
10381 CAGCGTAGGG TGGCCGGGGC TCCGGGGGCG AGATCTTCCA ACATAAGGCG ATGATATCCG 
10441 TAGATGTACC TGGACATCCA GGTGATGCCG GCGGCGGTGG TGGAGGCGCG CGGAAAGTCG 
10501 CGGACGCGGT TCCAGATGTT GCGCAGCGGC AAAAAGTGCT CCATGGTCGG GACGCTCTGG 
10561 CCGGTCAGGC GCGCGCAATC GTTGACGCTC TAGACCGTGC AAAAGGAGAG CCTGTAAGCG 
10621 GGCACTCTTC CGTGGTCTGG TGGATAAATT CGCAAGGGTA TCATGGCGGA CGACCGGGGT 
10681 TCGAGCCCCG TATCCGGCCG TCCGCCGTGA TCCATGCGGT TACCGCCCGC GTGTCGAACC 
10741 CAGGTGTGCG ACGTCAGACA ACGGGGGAGT GCTCCTTTTG GCTTCCTTCC AGGCGCGGCG 
10801 GCTGCTGCGC TAGCTTTTTT GGCCACTGGC CGCGCGCAGC GTAAGCGGTT AGGCTGGAAA 
10861 GCGAAAGCAT TAAGTGGCTC GCTCCCTGTA GCCGGAGGGT TATTTTCCAA GGGTTGAGTC 
10921 GCGGGACCCC CGGTTCGAGT CTCGGACCGG CCGGACTGCG GCGAACGGGG GTTTGCCTCC 
10981 CCGTCATGCA AGACCCCGCT TGCAAATTCC TCCGGAAACA GGGACGAGCC CCTTTTTTGC 
11041 TTTTCCCAGA TGCATCCGGT GCTGCGGCAG ATGCGCCCCC CTCCTCAGCA GCGGCAAGAG 
11101 CAAGAGCAGC GGCAGACATG CAGGGCACCC TCCCCTCCTC CTACCGCGTC AGGAGGGGCG 
11161 ACATCCGCGG TTGACGCGGC AGCAGATGGT GATTACGAAC CCCCGCGGCG CCGGGCCCGG 
11221 CACTACCTGG ACTTGGAGGA GGGCGAGGGC CTGGCGCGGC TAGGAGCGCC CTCTCCTGAG 
11281 CGGTACCCAA GGGTGCAGCT GAAGCGTGAT ACGCGTGAGG CGTACGTGCC GCGGCAGAAC 
11341 CTGTTTCGCG ACCGCGAGGG AGAGGAGCCC GAGGAGATGC GGGATCGAAA GTTCCACGCA 
11401 GGGCGCGAGC TGCGGCATGG CCTGAATCGC GAGCGGTTGC TGCGCGAGGA GGACTTTGAG 
11461 CCCGACGCGC GAACCGGGAT TAGTCCCGCG CGCGCACACG TGGCGGCCGC CGACCTGGTA 
11521 ACCGCATACG AGCAGACGGT GAACCAGGAG ATTAACTTTC AAAAAAGCTT TAACAACCAC 
11581 GTGCGTACGC TTGTGGCGCG CGAGGAGGTG GCTATAGGAC TGATGCATCT GTGGGACTTT 
11641 GTAAGCGCGC TGGAGCAAAA CCCAAATAGC AAGCCGCTCA TGGCGCAGCT GTTCCTTATA 
11701 GTGCAGCACA GCAGGGACAA CGAGGCATTC AGGGATGCGC TGCTAAACAT AGTAGAGCCC 
11761 GAGGGCCGCT GGCTGCTCGA TTTGATAAAC ATCCTGCAGA GCATAGTGGT GCAGGAGCGC 
11821 AGCTTGAGCC TGGCTGACAA GGTGGCCGCC ATCAACTATT CCATGCTTAG CCTGGGCAAG 
11881 TTTTACGCCC GCAAGATATA CCATACCCCT TACGTTCCCA TAGACAAGGA GGTAAAGATC 
11941 GAGGGGTTCT ACATGCGCAT GGCGCTGAAG GTGCTTACCT TGAGCGACGA CCTGGGCGTT 
12001 TATCGCAACG AGCGCATCCA CAAGGCCGTG AGCGTGAGCC GGCGGCGCGA GCTCAGCGAC 
12061 CGCGAGCTGA TGCACAGCCT GCAAAGGGCC CTGGCTGGCA CGGGCAGCGG CGATAGAGAG 
12121 GCCGAGTCCT ACTTTGACGC GGGCGCTGAC CTGCGCTGGG CCCCAAGCCG ACGCGCCCTG 
12181 GAGGCAGCTG GGGCCGGACC TGGGCTGGCG GTGGCACCCG CGCGCGCTGG CAACGTCGGC 
12241 GGCGTGGAGG AATATGACGA GGACGATGAG TACGAGCCAG AGGACGGCGA GTACTAAGCG 
12301 GTGATGTTTC TGATCAGATG ATGCAAGACG CAACGGACCC GGCGGTGCGG GCGGCGCTGC 
12361 AGAGCCAGCC GTCCGGCCTT AACTCCACGG ACGACTGGCG CCAGGTCATG GACCGCATCA
12421 TGTCGCTGAC TGCGCGCAAT CCTGACGCGT TCCGGCAGCA GCCGCAGGCC AACCGGCTCT 
12481 CCGCAATTCT GGAAGCGGTG GTCCCGGCGC GCGCAAACCC CACGCACGAG AAGGTGCTGG 
12541 CGATCGTAAA CGCGCTGGCC GAAAACAGGG CCATCCGGCC CGACGAGGCC GGCCTGGTCT 
12601 ACGACGCGCT GCTTCAGCGC GTGGCTCGTT ACAACAGCGG CAACGTGCAG ACCAACCTGG 
12661 ACCGGCTGGT GGGGGATGTG CGCGAGGCCG TGGCGCAGCG TGAGCGCGCG CAGCAGCAGG 
12721 GCAACCTGGG CTCCATGGTT GCACTAAACG CCTTCCTGAG TACACAGCCC GCCAACGTGC 
12781 CGCGGGGACA GGAGGACTAC ACCAACTTTG TGAGCGCACT GCGGCTAATG GTGACTGAGA
12841 CACCGCAAAG TGAGGTGTAC CAGTCTGGGC CAGACTATTT TTTCCAGACC AGTAGACAAG 
12901 GCCTGCAGAC CGTAAACCTG AGCCAGGCTT TCAAAAACTT GCAGGGGCTG TGGGGGGTGC 
12961 GGGCTCCCAC AGGCGACCGC GCGACCGTGT CTAGCTTGCT GACGCCCAAC TCGCGCCTGT 
13021 TGCTGCTGCT AATAGCGCCC TTCACGGACA GTGGCAGCGT GTCCCGGGAC ACATACCTAG 
13081 GTCACTTGCT GACACTGTAC CGCGAGGCCA TAGGTCAGGC GCATGTGGAC GAGCATACTT 
13141 TCCAGGAGAT TACAAGTGTC AGCCGCGCGC TGGGGCAGGA GGACACGGGC AGCCTGGAGG 
13201 CAACCCTAAA CTACCTGCTG ACCAACCGGC GGCAGAAGAT CCCCTCGTTG CACAGTTTAA
13261 ACAGCGAGGA GGAGCGCATT TTGCGCTACG TGCAGCAGAG CGTGAGCCTT AACCTGATGC 
13321 GCGACGGGGT AACGCCCAGC GTGGCGCTGG ACATGACCGC GCGCAACATG GAACCGGGCA
13381 TGTATGCCTC AAACCGGCCG TTTATCAACC GCCTAATGGA CTACTTGCAT CGCGCGGCCG
13441 CCGTGAACCC CGAGTATTTC ACCAATGCCA TCTTGAACCC GCACTGGCTA CCGCCCCCTG 
13501 GTTTCTACAC CGGGGGATTC GAGGTGCCCG AGGGTAACGA TGGATTCCTC TGGGACGACA 
13561 TAGACGACAG CGTGTTTTCC CCGCAACCGC AGACCCTGCT AGAGTTGCAA CAGCGCGAGC 
13621 AGGCAGAGGC GGCGCTGCGA AAGGAAAGCT TCCGCAGGCC AAGCAGCTTG TCCGATCTAG 
13681 GCGCTGCGGC CCCGCGGTCA GATGCTAGTA GCCCATTTCC AAGCTTGATA GGGTCTCTTA 
13741 CCAGCACTCG CACCACCCGC CCGCGCCTGC TGGGCGAGGA GGAGTACCTA AACAACTCGC 
13801 TGCTGCAGCC GCAGCGCGAA AAAAACCTGC CTCCGGCATT TCCCAACAAC GGGATAGAGA 
13861 GCCTAGTGGA CAAGATGAGT AGATGGAAGA CGTACGCGCA GGAGCACAGG GACGTGCCAG 
13921 GCCCGCGCCC GCCCACCCGT CGTCAAAGGC ACGACCGTCA GCGGGGTCTG GTGTGGGAGG 
13981 ACGATGACTC GGCAGACGAC AGCAGCGTCC TGGATTTGGG AGGGAGTGGC AACCCGTTTG 
14041 CGCACCTTCG CCCCAGGCTG GGGAGAATGT TTTAAAAAAA AAAAAGCATG ATGCAAAATA 
14101 AAAAACTCAC CAAGGCCATG GCACCGAGCG TTGGTTTTCT TGTATTCCCC TTAGTATGCG 
14161 GCGCGCGGCG ATGTATGAGG AAGGTCCTCC TCCCTCCTAC GAGAGTGTGG TGAGCGCGGC 
14221 GCCAGTGGCG GCGGCGCTGG GTTCTCCCTT CGATGCTCCC CTGGACCCGC CGTTTGTGCC 
14281 TCCGCGGTAC CTGCGGCCTA CCGGGGGGAG AAACAGCATC CGTTACTCTG AGTTGGCACC 
14341 CCTATTCGAC ACCACCCGTG TGTACCTGGT GGACAACAAG TCAACGGATG TGGCATCCCT 
14401 GAACTACCAG AACGACCACA GCAACTTTCT GACCACGGTC ATTCAAAACA ATGACTACAG 
14461 CCGGGGGGAG GCAAGCACAC AGACCATCAA TCTTGACGAC CGGTCGCACT GGGGCGGCGA 
14521 CCTGAAAACC ATCCTGCATA CCAACATGCC AAATGTGAAC GAGTTCATGT TTACCAATAA 
14581 GTTTAAGGCG CGGGTGATGG TGTCGCGCTT GCCTACTAAG GACAATCAGG TGGAGCTGAA 
14641 ATACGAGTGG GTGGAGTTCA CGCTGCCCGA GGGCAACTAC TCCGAGACCA TGACCATAGA 
14701 CCTTATGAAC AACGCGATCG TGGAGCACTA CTTGAAAGTG GGCAGACAGA ACGGGGTTCT 
14761 GGAAAGCGAC ATCGGGGTAA AGTTTGACAC CCGCAACTTC AGACTGGGGT TTGACCCCGT 
14821 CACTGGTCTT GTCATGCCTG GGGTATATAC AAACGAAGCC TTCCATCCAG ACATCATTTT 
14881 GCTGCCAGGA TGCGGGGTGG ACTTCACCCA CAGCCGCCTG AGCAACTTGT TGGGCATCCG 
14941 CAAGCGGCAA CCCTTCCAGG AGGGCTTTAG GATCACCTAC GATGATCTGG AGGGTGGTAA 
15001 CATTCCCGCA CTGTTGGATG TGGACGCCTA CCAGGCGAGC TTGAAAGATG ACACCGAACA 
15061 GGGCGGGGGT GGCGCAGGCG GCAGCAACAG CAGTGGCAGC GGCGCGGAAG AGAACTCCAA 
15121 CGCGGCAGCC GCGGCAATGC AGCCGGTGGA GGACATGAAC GATCATGCCA TTCGCGGCGA 
15181 CACCTTTGCC ACACGGGCTG AGGAGAAGCG CGCTGAGGCC GAAGCAGCGG CCGAAGCTGC 
15241 CGCCCCCGCT GCGCAACCCG AGGTCGAGAA GCCTCAGAAG AAACCGGTGA TCAAACCCCT 
15301 GACAGAGGAC AGCAAGAAAC GCAGTTACAA CCTAATAAGC AATGACAGCA CCTTCACCCA 
15361 GTACCGCAGC TGGTACCTTG CATACAACTA CGGCGACCCT CAGACCGGAA TCCGCTCATG 
15421 GACCCTGCTT TGCACTCCTG ACGTAACCTG CGGCTCGGAG CAGGTCTACT GGTCGTTGCC 
15481 AGACATGATG CAAGACCCCG TGACCTTCCG CTCCACGCGC CAGATCAGCA ACTTTCCGGT 
15541 GGTGGGCGCC GAGCTGTTGC CCGTGCACTC CAAGAGCTTC TACAACGACC AGGCCGTCTA 
15601 CTCCCAACTC ATCCGCCAGT TTACCTCTCT GACCCACGTG TTCAATCGCT TTCCCGAGAA 
15661 CCAGATTTTG GCGCGCCCGC CAGCCCCCAC CATCACCACC GTCAGTGAAA ACGTTCCTGC 
15721 TCTCACAGAT CACGGGACGC TACCGCTGCG CAACAGCATC GGAGGAGTCC AGCGAGTGAC
15781 CATTACTGAC GCCAGACGCC GCACCTGCCC CTACGTTTAC AAGGCCCTGG GCATAGTCTC 
15841 GCCGCGCGTC CTATCGAGCC GCACTTTTTG AGCAAGCATG TCCATCCTTA TATCGCCCAG 
15901 CAATAACACA GGCTGGGGCC TGCGCTTCCC AAGCAAGATG TTTGGCGGGG CCAAGAAGCG 
15961 CTCCGACCAA CACCCAGTGC GCGTGCGCGG GCACTACCGC GCGCCCTGGG GCGCGCACAA 
16021 ACGCGGCCGC ACTGGGCGCA CCACCGTCGA TGACGCCATC GACGCGGTGG TGGAGGAGGC 
16081 GCGCAACTAC ACGCCCACGC CGCCACCAGT GTCCACAGTG GACGCGGCCA TTCAGACCGT 
16141 GGTGCGCGGA GCCCGGCGCT ATGCTAAAAT GAAGAGACGG CGGAGGCGCG TAGCACGTCG 
16201 CCACCGCCGC CGACCCGGCA CTGCCGCCCA ACGCGCGGCG GCGGCCCTGC TTAACCGCGC 
16261 ACGTCGCACC GGCCGACGGG CGGCCATGCG GGCCGCTCGA AGGCTGGCCG CGGGTATTGT 
16321 CACTGTGCCC CCCAGGTCCA GGCGACGAGC GGCCGCCGCA GCAGCCGCGG CCATTAGTGC 
16381 TATGACTCAG GGTCGCAGGG GCAACGTGTA TTGGGTGCGC GACTCGGTTA GCGGCCTGCG 
16441 CGTGCCCGTG CGCACCCGCC CCCCGCGCAA CTAGATTGCA AGAAAAAACT ACTTAGACTC 
16501 GTACTGTTGT ATGTATCCAG CGGCGGCGGC GCGCAACGAA GCTATGTCCA AGCGCAAAAT
16561 CAAAGAAGAG ATGCTCCAGG TCATCGCGCC GGAGATCTAT GGCCCCCCGA AGAAGGAAGA 
16621 GCAGGATTAC AAGCCCCGAA AGCTAAAGCG GGTCAAAAAG AAAAAGAAAG ATGATGATGA 
16681 TGAACTTGAC GACGAGGTGG AACTGCTGCA CGCTACCGCG CCCAGGCGAC GGGTACAGTG 
16741 GAAAGGTCGA CGCGTAAAAC GTGTTTTGCG ACCCGGCACC ACCGTAGTCT TTACGCCCGG 
16801 TGAGCGCTCC ACCCGCACCT ACAAGCGCGT GTATGATGAG GTGTACGGCG ACGAGGACCT 
16861 GCTTGAGCAG GCCAACGAGC GCCTCGGGGA GTTTGCCTAC GGAAAGCGGC ATAAGGACAT 
16921 GCTGGCGTTG CCGCTGGACG AGGGCAACCC AACACCTAGC CTAAAGCCCG TAACACTGCA 
16981 GCAGGTGCTG CCCGCGCTTG CACCGTCCGA AGAAAAGCGC GGCCTAAAGC GCGAGTCTGG 
17041 TGACTTGGCA CCCACCGTGC AGCTGATGGT ACCCAAGCGC CAGCGACTGG AAGATGTCTT 
17101 GGAAAAAATG ACCGTGGAAC CTGGGCTGGA GCCCGAGGTC CGCGTGCGGC CAATCAAGCA 
17161 GGTGGCGCCG GGACTGGGCG TGCAGACCGT GGACGTTCAG ATACCCACTA CCAGTAGCAC 
17221 CAGTATTGCC ACCGCCACAG AGGGCATGGA GACACAAACG TCCCCGGTTG CCTCAGCGGT 
17281 GGCGGATGCC GCGGTGCAGG CGGTCGCTGC GGCCGCGTCC AAGACCTCTA CGGAGGTGCA 
17341 AACGGACCCG TGGATGTTTC GCGTTTCAGC CCCCCGGCGC CCGCGCGGTT CGAGGAAGTA 
17401 CGGCGCCGCC AGCGCGCTAC TGCCCGAATA TGCCCTACAT CCTTCCATTG CGCCTACCCC 
17461 CGGCTATCGT GGCTACACCT ACCGCCCCAG AAGACGAGCA ACTACCCGAC GCCGAACCAC 
17521 CACTGGAACC CGCCGCCGCC GTCGCCGTCG CCAGCCCGTG CTGGCCCCGA TTTCCGTGCG 
17581 CAGGGTGGCT CGCGAAGGAG GCAGGACCCT GGTGCTGCCA ACAGCGCGCT ACCACCCCAG 
17641 CATCGTTTAA AAGCCGGTCT TTGTGGTTCT TGCAGATATG GCCCTCACCT GCCGCCTCCG 
17701 TTTCCCGGTG CCGGGATTCC GAGGAAGAAT GCACCGTAGG AGGGGCATGG CCGGCCACGG 
17761 CCTGACGGGC GGCATGCGTC GTGCGCACCA CCGGCGGCGG CGCGCGTCGC ACCGTCGCAT 
17821 GCGCGGCGGT ATCCTGCCCC TCCTTATTCC ACTGATCGCC GCGGCGATTG GCGCCGTGCC 
17881 CGGAATTGCA TCCGTGGCCT TGCAGGCGCA GAGACACTGA TTAAAAACAA GTTGCATGTG 
17941 GAAAAATCAA AATAAAAAGT CTGGACTCTC ACGCTCGCTT GGTCCTGTAA CTATTTTGTA 
18001 GAATGGAAGA CATCAACTTT GCGTCTCTGG CCCCGCGACA CGGCTCGCGC CCGTTCATGG 
18061 GAAACTGGCA AGATATCGGC ACCAGCAATA TGAGCGGTGG CGCCTTCAGC TGGGGCTCGC 
18121 TGTGGAGCGG CATTAAAAAT TTCGGTTCCA CCGTTAAGAA CTATGGCAGC AAGGCCTGGA 
18181 ACAGCAGCAC AGGCCAGATG CTGAGGGATA AGTTGAAAGA GCAAAATTTC CAACAAAAGG 
18241 TGGTAGATGG CCTGGCCTCT GGCATTAGCG GGGTGGTGGA CCTGGCCAAC CAGGCAGTGC 
18301 AAAATAAGAT TAACAGTAAG CTTGATCCCC GCCCTCCCGT AGAGGAGCCT CCACCGGCCG 
18361 TGGAGACAGT GTCTCCAGAG GGGCGTGGCG AAAAGCGTCC GCGCCCCGAC AGGGAAGAAA 
18421 CTCTGGTGAC GCAAATAGAC GAGCCTCCCT CGTACGAGGA GGCACTAAAG CAAGGCCTGC 
18481 CCACCACCCG TCCCATCGCG CCCATGGCTA CCGGAGTGCT GGGCCAGCAC ACACCCGTAA 
18541 CGCTGGACCT GCCTCCCCCC GCCGACACCC AGCAGAAACC TGTGCTGCCA GGCCCGACCG 
18601 CCGTTGTTGT AACCCGTCCT AGCCGCGCGT CCCTGCGCCG CGCCGCCAGC GGTCCGCGAT 
18661 CGTTGCGGCC CGTAGCCAGT GGCAACTGGC AAAGCACACT GAACAGCATC GTGGGTCTGG 
18721 GGGTGCAATC CCTGAAGCGC CGACGATGCT TCTGAATAGC TAACGTGTCG TATGTGTGTC 
18781 ATGTATGCGT CCATGTCGCC GCCAGAGGAG CTGCTGAGCC GCCGCGCGCC CGCTTTCCAA 
18841 GATGGCTACC CCTTCGATGA TGCCGCAGTG GTCTTACATG CACATCTCGG GCCAGGACGC 
18901 CTCGGAGTAC CTGAGCCCCG GGCTGGTGCA GTTTGCCCGC GCCACCGAGA CGTACTTCAG 
18961 CCTGAATAAC AAGTTTAGAA ACCCCACGGT GGCGCCTACG CACGACGTGA CCACAGACCG 
19021 GTCCCAGCGT TTGACGCTGC GGTTCATCCC TGTGGACCGT GAGGATACTG CGTACTCGTA 
19081 CAAGGCGCGG TTCACCCTAG CTGTGGGTGA TAACCGTGTG CTGGACATGG CTTCCACGTA 
19141 CTTTGACATC CGCGGCGTGC TGGACAGGGG CCCTACTTTT AAGCCCTACT CTGGCACTGC 
19201 CTACAACGCC CTGGCTCCCA AGGGTGCCCC AAATCCTTGC GAATGGGATG AAGCTGCTAC 
19261 TGCTCTTGAA ATAAACCTAG AAGAAGAGGA CGATGACAAC GAAGACGAAG TAGACGAGCA 
19321 AGCTGAGCAG CAAAAAACTC ACGTATTTGG GCAGGCGCCT TATTCTGGTA TAAATATTAC 
19381 AAAGGAGGGT ATTCAAATAG GTGTCGAAGG TCAAACACCT AAATATGCCG ATAAAACATT 
19441 TCAACCTGAA CCTCAAATAG GAGAATCTCA GTGGTACGAA ACTGAAATTA ATCATGCAGC 
19501 TGGGAGAGTC CTTAAAAAGA CTACCCCAAT GAAACCATGT TACGGTTCAT ATGCAAAACC 
19561 CACAAATGAA AATGGAGGGC AAGGCATTCT TGTAAAGCAA CAAAATGGAA AGCTAGAAAG 
19621 TCAAGTGGAA ATGCAATTTT TCTCAACTAC TGAGGCGACC GCAGGCAATG GTGATAACTT 
19681 GACTCCTAAA GTGGTATTGT ACAGTGAAGA TGTAGATATA GAAACCCCAG ACACTCATAT 
19741 TTCTTACATG CCCACTATTA AGGAAGGTAA CTCACGAGAA CTAATGGGCC AACAATCTAT 
19801 GCCCAACAGG CCTAATTACA TTGCTTTTAG GGACAATTTT ATTGGTCTAA TGTATTACAA
19861 CAGCACGGGT AATATGGGTG TTCTGGCGGG CCAAGCATCG CAGTTGAATG CTGTTGTAGA 
19921 TTTGCAAGAC AGAAACACAG AGCTTTCATA CCAGCTTTTG CTTGATTCCA TTGGTGATAG 
19981 AACCAGGTAC TTTTCTATGT GGAATCAGGC TGTTGACAGC TATGATCCAG ATGTTAGAAT 
20041 TATTGAAAAT CATGGAACTG AAGATGAACT TCCAAATTAC TGCTTTCCAC TGGGAGGTGT 
20101 GATTAATACA GAGACTCTTA CCAAGGTAAA ACCTAAAACA GGTCAGGAAA ATGGATGGGA 
20161 AAAAGATGCT ACAGAATTTT CAGATAAAAA TGAAATAAGA GTTGGAAATA ATTTTGCCAT 
20221 GGAAATCAAT CTAAATGCCA ACCTGTGGAG AAATTTCCTG TACTCCAACA TAGCGCTGTA 
20281 TTTGCCCGAC AAGCTAAAGT ACAGTCCTTC CAACGTAAAA ATTTCTGATA ACCCAAACAC 
20341 CTACGACTAC ATGAACAAGC GAGTGGTGGC TCCCGGGTTA GTGGACTGCT ACATTAACCT 
20401 TGGAGCACGC TGGTCCCTTG ACTATATGGA CAACGTCAAC CCATTTAACC ACCACCGCAA 
20461 TGCTGGCCTG CGCTACCGCT CAATGTTGCT GGGCAATGGT CGCTATGTGC CCTTCCACAT 
20521 CCAGGTGCCT CAGAAGTTCT TTGCCATTAA AAACCTCCTT CTCCTGCCGG GCTCATACAC 
20581 CTACGAGTGG AACTTCAGGA AGGATGTTAA CATGGTTCTG CAGAGCTCCC TAGGAAATGA 
20641 CCTAAGGGTT GACGGAGCCA GCATTAAGTT TGATAGCATT TGCCTTTACG CCACCTTCTT 
20701 CCCCATGGCC CACAACACCG CCTCCACGCT TGAGGCCATG CTTAGAAACG ACACCAACGA 
20761 CCAGTCCTTT AACGACTATC TCTCCGCCGC CAACATGCTC TACCCTATAC CCGCCAACGC 
20821 TACCAACGTG CCCATATCCA TCCCCTCCCG CAACTGGGCG GCTTTCCGCG GCTGGGCCTT 
20881 CACGCGCCTT AAGACTAAGG AAACCCCATC ACTGGGCTCG GGCTACGACC CTTATTACAC 
20941 CTACTCTGGC TCTATACCCT ACCTAGATGG AACCTTTTAC CTCAACCACA CCTTTAAGAA 
21001 GGTGGCCATT ACCTTTGACT CTTCTGTCAG CTGGCCTGGC AATGACCGCC TGCTTACCCC 
21061 CAACGAGTTT GAAATTAAGC GCTCAGTTGA CGGGGAGGGT TACAACGTTG CCCAGTGTAA 
21121 CATGACCAAA GACTGGTTCC TGGTACAAAT GCTAGCTAAC TACAACATTG GCTACCAGGG 
21181 CTTCTATATC CCAGAGAGCT ACAAGGACCG CATGTACTCC TTCTTTAGAA ACTTCCAGCC 
21241 CATGAGCCGT CAGGTGGTGG ATGATACTAA ATACAAGGAC TACCAACAGG TGGGCATCCT 
21301 ACACCAACAC AACAACTCTG GATTTGTTGG CTACCTTGCC CCCACCATGC GCGAAGGACA 
21361 GGCCTACCCT GCTAACTTCC CCTATCCGCT TATAGGCAAG ACCGCAGTTG ACAGCATTAC 
21421 CCAGAAAAAG TTTCTTTGCG ATCGCACCCT TTGGCGCATC CCATTCTCCA GTAACTTTAT 
21481 GTCCATGGGC GCACTCACAG ACCTGGGCCA AAACCTTCTC TACGCCAACT CCGCCCACGC 
21541 GCTAGACATG ACTTTTGAGG TGGATCCCAT GGACGAGCCC ACCCTTCTTT ATGTTTTGTT 
21601 TGAAGTCTTT GACGTGGTCC GTGTGCACCG GCCGCACCGC GGCGTCATCG AAACCGTGTA 
21661 CCTGCGCACG CCCTTCTCGG CCGCCAACGC CACAACATAA AGAAGCAAGC AACATCAACA 
21721 ACAGCTGCCG CCATGGGCTC CAGTGAGCAG GAACTGAAAG CCATTGTCAA AGATCTTGGT 
21781 TGTGGGCCAT ATTTTTTGGG CACCTATGAC AAGCGCTTTC CAGGCTTTGT TTCTCCACAC 
21841 AAGCTCGCCT GCGCCATAGT CAATACGGCC GGTCGCGAGA CTGGGGGCGT ACACTGGATG 
21901 GCCTTTGCCT GGAACCCGCA CTCAAAAACA TGCTACCTCT TTGAGCCCTT TGGCTTTTCT 
21961 GACCAGCGAC TCAAGCAGGT TTACCAGTTT GAGTACGAGT CACTCCTGCG CCGTAGCGCC 
22021 ATTGCTTCTT CCCCCGACCG CTGTATAACG CTGGAAAAGT CCACCCAAAG CGTACAGGGG 
22081 CCCAACTCGG CCGCCTGTGG ACTATTCTGC TGCATGTTTC TCCACGCCTT TGCCAACTGG 
22141 CCCCAAACTC CCATGGATCA CAACCCCACC ATGAACCTTA TTACCGGGGT ACCCAACTCC 
22201 ATGCTCAACA GTCCCCAGGT ACAGCCCACC CTGCGTCGCA ACCAGGAACA GCTCTACAGC 
22261 TTCCTGGAGC GCCACTCGCC CTACTTCCGC AGCCACAGTG CGCAGATTAG GAGCGCCACT 
22321 TCTTTTTGTC ACTTGAAAAA CATGTAAAAA TAATGTACTA GAGACACTTT CAATAAAGGC 
22381 AAATGCTTTT ATTTGTACAC TCTCGGGTGA TTATTTACCC CCACCCTTGC CGTCTGCGCC 
22441 GTTTAAAAAT CAAAGGGGTT CTGCCGCGCA TCGCTATGCG CCACTGGCAG GGACACGTTG 
22501 CGATACTGGT GTTTAGTGCT CCACTTAAAC TCAGGCACAA CCATCCGCGG CAGCTCGGTG 
22561 AAGTTTTCAC TCCACAGGCT GCGCACCATC ACCAACGCGT TTAGCAGGTC GGGCGCCGAT 
22621 ATCTTGAAGT CGCAGTTGGG GCCTCCGCCC TGCGCGCGCG AGTTGCGATA CACAGGGTTG 
22681 CAGCACTGGA ACACTATCAG CGCCGGGTGG TGCACGCTGG CCAGCACGCT CTTGTCGGAG 
22741 ATCAGATCCG CGTCCAGGTC CTCCGCGTTG CTCAGGGCGA ACGGAGTCAA CTTTGGTAGC 
22801 TGCCTTCCCA AAAAGGGCGC GTGCCCAGGC TTTGAGTTGC ACTCGCACCG TAGTGGCATC 
22861 AAAAGGTGAC CGTGCCCGGT CTGGGCGTTA GGATACAGCG CCTGCATAAA AGCCTTGATC 
22921 TGCTTAAAAG CCACCTGAGC CTTTGCGCCT TCAGAGAAGA ACATGCCGCA AGACTTGCCG 
22981 GAAAACTGAT TGGCCGGACA GGCCGCGTCG TGCACGCAGC ACCTTGCGTC GGTGTTGGAG 
23041 ATCTGCACCA CATTTCGGCC CCACCGGTTC TTCACGATCT TGGCCTTGCT AGACTGCTCC 
23101 TTCAGCGCGC GCTGCCCGTT TTCGCTCGTC ACATCCATTT CAATCACGTG CTCCTTATTT
23161 ATCATAATGC TTCCGTGTAG ACACTTAAGC TCGCCTTCGA TCTCAGCGCA GCGGTGCAGC 
23221 CACAACGCGC AGCCCGTGGG CTCGTGATGC TTGTAGGTCA CCTCTGCAAA CGACTGCAGG 
23281 TACGCCTGCA GGAATCGCCC CATCATCGTC ACAAAGGTCT TGTTGCTGGT GAAGGTCAGC 
23341 TGCAACCCGC GGTGCTCCTC GTTCAGCCAG GTCTTGCATA CGGCCGCCAG AGCTTCCACT 
23401 TGGTCAGGCA GTAGTTTGAA GTTCGCCTTT AGATCGTTAT CCACGTGGTA CTTGTCCATC 
23461 AGCGCGCGCG CAGCCTCCAT GCCCTTCTCC CACGCAGACA CGATCGGCAC ACTCAGCGGG 
23521 TTCATCACCG TAATTTCACT TTCCGCTTCG CTGGGCTCTT CCTCTTCCTC TTGCGTCCGC 
23581 ATACCACGCG CCACTGGGTC GTCTTCATTC AGCCGCCGCA CTGTGCGCTT ACCTCCTTTG 
23641 CCATGCTTGA TTAGCACCGG TGGGTTGCTG AAACCCACCA TTTGTAGCGC CACATCTTCT 
23701 CTTTCTTCCT CGCTGTCCAC GATTACCTCT GGTGATGGCG GGCGCTCGGG CTTGGGAGAA 
23761 GGGCGCTTCT TTTTCTTCTT GGGCGCAATG GCCAAATCCG CCGCCGAGGT CGATGGCCGC 
23821 GGGCTGGGTG TGCGCGGCAC CAGCGCGTCT TGTGATGAGT CTTCCTCGTC CTCGGACTCG 
23881 ATACGCCGCC TCATCCGCTT TTTTGGGGGC GCCCGGGGAG GCGGCGGCGA CGGGGACGGG 
23941 GACGACACGT CCTCCATGGT TGGGGGACGT CGCGCCGCAC CGCGTCCGCG CTCGGGGGTG 
24001 GTTTCGCGCT GCTCCTCTTC CCGACTGGCC ATTTCCTTCT CCTATAGGCA GAAAAAGATC 
24061 ATGGAGTCAG TCGAGAAGAA GGACAGCCTA ACCGCCCCCT CTGAGTTCGC CACCACCGCC 
24121 TCCACCGATG CCGCCAACGC GCCTACCACC TTCCCCGTCG AGGCACCCCC GCTTGAGGAG 
24181 GAGGAAGTGA TTATCGAGCA GGACCCAGGT TTTGTAAGCG AAGACGACGA GGACCGCTCA 
24241 GTACCAACAG AGGATAAAAA GCAAGACCAG GACAACGCAG AGGCAAACGA GGAACAAGTC 
24301 GGGCGGGGGG ACGAAAGGCA TGGCGACTAC CTAGATGTGG GAGACGACGT GCTGTTGAAG 
24361 CATCTGCAGC GCCAGTGCGC CATTATCTGC GACGCGTTGC AAGAGCGCAG CGATGTGCCC 
24421 CTCGCCATAG CGGATGTCAG CCTTGCCTAC GAACGCCACC TATTCTCACC GCGCGTACCC 
24481 CCCAAACGCC AAGAAAACGG CACATGCGAG CCCAACCCGC GCCTCAACTT CTACCCCGTA
24541 TTTGCCGTGC CAGAGGTGCT TGCCACCTAT CACATCTTTT TCCAAAACTG CAAGATACCC 
24601 CTATCCTGCC GTGCCAACCG CAGCCGAGCG GACAAGCAGC TGGCCTTGCG GCAGGGCGCT 
24661 GTCATACCTG ATATCGCCTC GCTCAACGAA GTGCCAAAAA TCTTTGAGGG TCTTGGACGC 
24721 GACGAGAAGC GCGCGGCAAA CGCTCTGCAA CAGGAAAACA GCGAAAATGA AAGTCACTCT 
24781 GGAGTGTTGG TGGAACTCGA GGGTGACAAC GCGCGCCTAG CCGTACTAAA ACGCAGCATC 
24841 GAGGTCACCC ACTTTGCCTA CCCGGCACTT AACCTACCCC CCAAGGTCAT GAGCACAGTC 
24901 ATGAGTGAGC TGATCGTGCG CCGTGCGCAG CCCCTGGAGA GGGATGCAAA TTTGCAAGAA 
24961 CAAACAGAGG AGGGCCTACC CGCAGTTGGC GACGAGCAGC TAGCGCGCTG GCTTCAAACG 
25021 CGCGAGCCTG CCGACTTGGA GGAGCGACGC AAACTAATGA TGGCCGCAGT GCTCGTTACC 
25081 GTGGAGCTTG AGTGCATGCA GCGGTTCTTT GCTGACCCGG AGATGCAGCG CAAGCTAGAG 
25141 GAAACATTGC ACTACACCTT TCGACAGGGC TACGTACGCC AGGCCTGCAA GATCTCCAAC 
25201 GTGGAGCTCT GCAACCTGGT CTCCTACCTT GGAATTTTGC ACGAAAACCG CCTTGGGCAA 
25261 AACGTGCTTC ATTCCACGCT CAAGGGCGAG GCGCGCCGCG ACTACGTCCG CGACTGCGTT 
25321 TACTTATTTC TATGCTACAC CTGGCAGACG GCCATGGGCG TTTGGCAGCA GTGCTTGGAG 
25381 GAGTGCAACC TCAAGGAGCT GCAGAAACTG CTAAAGCAAA ACTTGAAGGA CCTATGGACG 
25441 GCCTTCAACG AGCGCTCCGT GGCCGCGCAC CTGGCGGACA TCATTTTCCC CGAACGCCTG 
25501 CTTAAAACCC TGCAACAGGG TCTGCCAGAC TTCACCAGTC AAAGCATGTT GCAGAACTTT 
25561 AGGAACTTTA TCCTAGAGCG CTCAGGAATC TTGCCCGCCA CCTGCTGTGC ACTTCCTAGC 
25621 GACTTTGTGC CCATTAAGTA CCGCGAATGC CCTCCGCCGC TTTGGGGCCA CTGCTACCTT 
25681 CTGCAGCTAG CCAACTACCT TGCCTACCAC TCTGACATAA TGGAAGACGT GAGCGGTGAC 
25741 GGTCTACTGG AGTGTCACTG TCGCTGCAAC CTATGCACCC CGCACCGCTC CCTGGTTTGC 
25801 AATTCGCAGC TGCTTAACGA AAGTCAAATT ATCGGTACCT TTGAGCTGCA GGGTCCCTCG 
25861 CCTGACGAAA AGTCCGCGGC TCCGGGGTTG AAACTCACTC CGGGGCTGTG GACGTCGGCT 
25921 TACCTTCGCA AATTTGTACC TGAGGACTAC CACGCCCACG AGATTAGGTT CTACGAAGAC 
25981 CAATCCCGCC CGCCAAATGC GGAGCTTACC GCCTGCGTCA TTACCCAGGG CCACATTCTT 
26041 GGCCAATTGC AAGCCATCAA CAAAGCCCGC CAAGAGTTTC TGCTACGAAA GGGACGGGGG 
26101 GTTTACTTGG ACCCCCAGTC CGGCGAGGAG CTCAACCCAA TCCCCCCGCC GCCGCAGCCC 
26161 TATCAGCAGC AGCCGCGGGC CCTTGCTTCC CAGGATGGCA CCCAAAAAGA AGCTGCAGCT 
26221 GCCGCCGCCA CCCACGGACG AGGAGGAATA CTGGGACAGT CAGGCAGAGG AGGTTTTGGA 
26281 CGAGGAGGAG GAGGACATGA TGGAAGACTG GGAGAGCCTA GACGAGGAAG CTTCCGAGGT 
26341 CGAAGAGGTG TCAGACGAAA CACCGTCACC CTCGGTCGCA TTCCCCTCGC CGGCGCCCCA 
26401 GAAATCGGCA ACCGGTTCCA GCATGGCTAC AACCTCCGCT CCTCAGGCGC CGCCGGCACT 
26461 GCCCGTTCGC CGACCCAACC GTAGATGGGA CACCACTGGA ACCAGGGCCG GTAAGTCCAA 
26521 GCAGCCGCCG CCGTTAGCCC AAGAGCAACA ACAGCGCCAA GGCTACCGCT CATGGCGCGG 
26581 GCACAAGAAC GCCATAGTTG CTTGCTTGCA AGACTGTGGG GGCAACATCT CCTTCGCCCG 
26641 CCGCTTTCTT CTCTACCATC ACGGCGTGGC CTTCCCCCGT AACATCCTGC ATTACTACCG 
26701 TCATCTCTAC AGCCCATACT GCACCGGCGG CAGCGGCAGC GGCAGCAACA GCAGCGGCCA 
26761 CACAGAAGCA AAGGCGACCG GATAGCAAGA CTCTGACAAA GCCCAAGAAA TCCACAGCGG 
26821 CGGCAGCAGC AGGAGGAGGA GCGCTGCGTC TGGCGCCCAA CGAACCCGTA TCGACCCGCG 
26881 AGCTTAGAAA CAGGATTTTT CCCACTCTGT ATGCTATATT TCAACAGAGC AGGGGCCAAG 
26941 AACAAGAGCT GAAAATAAAA AACAGGTCTC TGCGATCCCT CACCCGCAGC TGCCTGTATC 
27001 ACAAAAGCGA AGATCAGCTT CGGCGCACGC TGGAAGACGC GGAGGCTCTC TTCAGTAAAT 
27061 ACTGCGCGCT GACTCTTAAG GACTAGTTTC GCGCCCTTTC TCAAATTTAA GCGCGAAAAC 
27121 TACGTCATCT CCAGCGGCCA CACCCGGCGC CAGCACCTGT CGTCAGCGCC ATTATGAGCA 
27181 AGGAAATTCC CACGCCCTAC ATGTGGAGTT ACCAGCCACA AATGGGACTT GCGGCTGGAG 
27241 CTGCCCAAGA CTACTCAACC CGAATAAACT ACATGAGCGC GGGACCCCAC ATGATATCCC 
27301 GGGTCAACGG AATCCGCGCC CACCGAAACC GAATTCTCTT GGAACAGGCG GCTATTACCA 
27361 CCACACCTCG TAATAACCTT AATCCCCGTA GTTGGCCCGC TGCCCTGGTG TACCAGGAAA 
27421 GTCCCGCTCC CACCACTGTG GTACTTCCCA GAGACGCCCA GGCCGAAGTT CAGATGACTA 
27481 ACTCAGGGGC GCAGCTTGCG GGCGGCTTTC GTCACAGGGT GCGGTCGCCC GGGCAGGGTA 
27541 TAACTCACCT GACAATCAGA GGGCGAGGTA TTCAGCTCAA CGACGAGTCG GTGAGCTCCT 
27601 CGCTTGGTCT CCGTCCGGAC GGGACATTTC AGATCGGCGG CGCCGGCCGT CCTTCATTCA 
27661 CGCCTCGTCA GGCAATCCTA ACTCTGCAGA CCTCGTCCTC TGAGCCGCGC TCTGGAGGCA 
27721 TTGGAACTCT GCAATTTATT GAGGAGTTTG TGCCATCGGT CTACTTTAAC CCCTTCTCGG 
27781 GACCTCCCGG CCACTATCCG GATCAATTTA TTCCTAACTT TGACGCGGTA AAGGACTCGG 
27841 CGGACGGCTA CGACTGAATG TTAAGTGGAG AGGCAGAGCA ACTGCGCCTG AAACACCTGG 
27901 TCCACTGTCG CCGCCACAAG TGCTTTGCCC GCGACTCCGG TGAGTTTTGC TACTTTGAAT 
27961 TGCCCGAGGA TCATATCGAG GGCCCGGCGC ACGGCGTCCG GCTTACCGCC CAGGGAGAGC 
28021 TTGCCCGTAG CCTGATTCGG GAGTTTACCC AGCGCCCCCT GCTAGTTGAG CGGGACAGGG 
28081 GACCCTGTGT TCTCACTGTG ATTTGCAACT GTCCTAACCT TGGATTACAT CAAGATCTTT 
28141 GTTGCCATCT CTGTGCTGAG TATAATAAAT ACAGAAATTA AAATATACTG GGGCTCCTAT 
28201 CGCCATCCTG TAAACGCCAC CGTCTTCACC CGCCCAAGCA AACCAAGGCG AACCTTACCT 
28261 GGTACTTTTA ACATCTCTCC CTCTGTGATT TACAACAGTT TCAACCCAGA CGGAGTGAGT 
28321 CTACGAGAGA ACCTCTCCGA GCTCAGCTAC TCCATCAGAA AAAACACCAC CCTCCTTACC 
28381 TGCCGGGAAC GTACGAGTGC GTCACCGGCC GCTGCACCAC ACCTACCGCC TGACCGTAAA 
28441 CCAGACTTTT TCCGGACAGA CCTCAATAAC TCTGTTTACC AGAACAGGAG GTGAGCTTAG 
28501 AAAACCCTTA GGGTATTAGG CCAAAGGCGC AGCTACTGTG GGGTTTATGA ACAATTCAAG 
28561 CAACTCTACG GGCTATTCTA ATTCAGGTTT CTCTAGAATC GGGGTTGGGG TTATTCTCTG 
28621 TCTTGTGATT CTCTTTATTC TTATACTAAC GCTTCTCTGC CTAAGGCTCG CCGCCTGCTG 
28681 TGTGCACATT TGCATTTATT GTCAGCTTTT TAAACGCTGG GGTCGCCACC CAAGATGATT 
28741 AGGTACATAA TCCTAGGTTT ACTCACCCTT GCGTCAGCCC ACGGTACCAC CCAAAAGGTG 
28801 GATTTTAAGG AGCCAGCCTG TAATGTTACA TTCGCAGCTG AAGCTAATGA GTGCACCACT 
28861 CTTATAAAAT GCACCACAGA ACATGAAAAG CTGCTTATTC GCCACAAAAA CAAAATTGGC 
28921 AAGTATGCTG TTTATGCTAT TTGGCAGCCA GGTGACACTA CAGAGTATAA TGTTACAGTT 
28981 TTCCAGGGTA AAAGTCATAA AACTTTTATG TATACTTTTC CATTTTATGA AATGTGCGAC 
29041 ATTACCATGT ACATGAGCAA ACAGTATAAG TTGTGGCCCC CACAAAATTG TGTGGAAAAC 
29101 ACTGGCACTT TCTGCTGCAC TGCTATGCTA ATTACAGTGC TCGCTTTGGT CTGTACCCTA 
29161 CTCTATATTA AATACAAAAG CAGACGCAGC TTTATTGAGG AAAAGAAAAT GCCTTAATTT 
29221 ACTAAGTTAC AAAGCTAATG TCACCACTAA CTGCTTTACT CGCTGCTTGC AAAACAAATT 
29281 CAAAAAGTTA GCATTATAAT TAGAATAGGA TTTAAACCCC CCGGTCATTT CCTGCTCAAT 
29341 ACCATTCCCC TGAACAATTG ACTCTATGTG GGATATGCTC CAGCGCTACA ACCTTGAAGT 
29401 CAGGCTTCCT GGATGTCAGC ATCTGACTTT GGCCAGCACC TGTCCCGCGG ATTTGTTCCA 
29461 GTCCAACTAC AGCGACCCAC CCTAACAGAG ATGACCAACA CAACCAACGC GGCCGCCGCT 
29521 ACCGGACTTA CATCTACCAC AAATACACCC CAAGTTTCTG CCTTTGTCAA TAACTGGGAT 
29581 AACTTGGGCA TGTGGTGGTT CTCCATAGCG CTTATGTTTG TATGCCTTAT TATTATGTGG 
29641 CTCATCTGCT GCCTAAAGCG CAAACGCGCC CGACCACCCA TCTATAGTCC CATCATTGTG 
29701 CTACACCCAA ACAATGATGG AATCCATAGA TTGGACGGAC TGAAACACAT GTTCTTTTCT 
29761 CTTACAGTAT GATTAAATGA GACATGATTC CTCGAGTTTT TATATTACTG ACCCTTGTTG 
29821 CGCTTTTTTG TGCGTGCTCC ACATTGGCTG CGGTTTCTCA CATCGAAGTA GACTGCATTC 
29881 CAGCCTTCAC AGTCTATTTG CTTTACGGAT TTGTCACCCT CACGCTCATC TGCAGCCTCA 
29941 TCACTGTGGT CATCGCCTTT ATCCAGTGCA TTGACTGGGT CTGTGTGCGC TTTGCATATC 
30001 TCAGACACCA TCCCCAGTAC AGGGACAGGA CTATAGCTGA GCTTCTTAGA ATTCTTTAAT 
30061 TATGAAATTT ACTGTGACTT TTCTGCTGAT TATTTGCACC CTATCTGCGT TTTGTTCCCC 
30121 GACCTCCAAG CCTCAAAGAC ATATATCATG CAGATTCACT CGTATATGGA ATATTCCAAG 
30181 TTGCTACAAT GAAAAAAGCG ATCTTTCCGA AGCCTGGTTA TATGCAATCA TCTCTGTTAT 
30241 GGTGTTCTGC AGTACCATCT TAGCCCTAGC TATATATCCC TACCTTGACA TTGGCTGGAA 
30301 ACGAATAGAT GCCATGAACC ACCCAACTTT CCCCGCGCCC GCTATGCTTC CACTGCAACA 
30361 AGTTGTTGCC GGCGGCTTTG TCCCAGCCAA TCAGCCTCGC CCCACTTCTC CCACCCCCAC 
30421 TGAAATCAGC TACTTTAATC TAACAGGAGG AGATGACTGA CACCCTAGAT CTAGAAATGG 
30481 ACGGAATTAT TACAGAGCAG CGCCTGCTAG AAAGACGCAG GGCAGCGGCC GAGCAACAGC 
30541 GCATGAATCA AGAGCTCCAA GACATGGTTA ACTTGCACCA GTGCAAAAGG GGTATCTTTT 
30601 GTCTGGTAAA GCAGGCCAAA GTCACCTACG ACAGTAATAC CACCGGACAC CGCCTTAGCT 
30661 ACAAGTTGCC AACCAAGCGT CAGAAATTGG TGGTCATGGT GGGAGAAAAG CCCATTACCA 
30721 TAACTCAGCA CTCGGTAGAA ACCGAAGGCT GCATTCACTC ACCTTGTCAA GGACCTGAGG 
30781 ATCTCTGCAC CCTTATTAAG ACCCTGTGCG GTCTCAAAGA TCTTATTCCC TTTAACTAAT 
30841 AAAAAAAAAT AATAAAGCAT CACTTACTTA AAATCAGTTA GCAAATTTCT GTCCAGTTTA 
30901 TTCAGCAGCA CCTCCTTGCC CTCCTCCCAG CTCTGGTATT GCAGCTTCCT CCTGGCTGCA 
30961 AACTTTCTCC ACAATCTAAA TGGAATGTCA GTTTCCTCCT GTTCCTGTCC ATCCGCACCC 
31021 ACTATCTTCA TGTTGTTGCA GATGAAGCGC GCAAGACCGT CTGAAGATAC CTTCAACCCC 
31081 GTGTATCCAT ATGACACGGA AACCGGTCCT CCAACTGTGC CTTTTCTTAC TCCTCCCTTT 
31141 GTATCCCCCA ATGGGTTTCA AGAGAGTCCC CCTGGGGTAC TCTCTTTGCG CCTATCCGAA 
31201 CCTCTAGTTA CCTCCAATGG CATGCTTGCG CTCAAAATGG GCAACGGCCT CTCTCTGGAC 
31261 GAGGCCGGCA ACCTTACCTC CCAAAATGTA ACCACTGTGA GCCCACCTCT CAAAAAAACC 
31321 AAGTCAAACA TAAACCTGGA AATATCTGCA CCCCTCACAG TTACCTCAGA AGCCCTAACT 
31381 GTGGCTGCCG CCGCACCTCT AATGGTCGCG GGCAACACAC TCACCATGCA ATCACAGGCC 
31441 CCGCTAACCG TGCACGACTC CAAACTTAGC ATTGCCACCC AAGGACCCCT CACAGTGTCA 
31501 GAAGGAAAGC TAGCCCTGCA AACATCAGGC CCCCTCACCA CCACCGATAG CAGTACCCTT 
31561 ACTATCACTG CCTCACCCCC TCTAACTACT GCCACTGGTA GCTTGGGCAT TGACTTGAAA 
31621 GAGCCCATTT ATACACAAAA TGGAAAACTA GGACTAAAGT ACGGGGCTCC TTTGCATGTA 
31681 ACAGACGACC TAAACACTTT GACCGTAGCA ACTGGTCCAG GTGTGACTAT TAATAATACT 
31741 TCCTTGCAAA CTAAAGTTAC TGGAGCCTTG GGTTTTGATT CACAAGGCAA TATGCAACTT 
31801 AATGTAGCAG GAGGACTAAG GATTGATTCT CAAAACAGAC GCCTTATACT TGATGTTAGT 
31861 TATCCGTTTG ATGCTCAAAA CCAACTAAAT CTAAGACTAG GACAGGGCCC TCTTTTTATA 
31921 AACTCAGCCC ACAACTTGGA TATTAACTAC AACAAAGGCC TTTACTTGTT TACAGCTTCA 
31981 AACAATTCCA AAAAGCTTGA GGTTAACCTA AGCACTGCCA AGGGGTTGAT GTTTGACGCT 
32041 ACAGCCATAG CCATTAATGC AGGAGATGGG CTTGAATTTG GTTCACCTAA TGCACCAAAC 
32101 ACAAATCCCC TCAAAACAAA AATTGGCCAT GGCCTAGAAT TTGATTCAAA CAAGGCTATG 
32161 GTTCCTAAAC TAGGAACTGG CCTTAGTTTT GACAGCACAG GTGCCATTAC AGTAGGAAAC 
32221 AAAAATAATG ATAAGCTAAC TTTGTGGACC ACACCAGCTC CATCTCCTAA CTGTAGACTA 
32281 AATGCAGAGA AAGATGCTAA ACTCACTTTG GTCTTAACAA AATGTGGCAG TCAAATACTT 
32341 GCTACAGTTT CAGTTTTGGC TGTTAAAGGC AGTTTGGCTC CAATATCTGG AACAGTTCAA 
32401 AGTGCTCATC TTATTATAAG ATTTGACGAA AATGGAGTGC TACTAAACAA TTCCTTCCTG 
32461 GACCCAGAAT ATTGGAACTT TAGAAATGGA GATCTTACTG AAGGCACAGC CTATACAAAC 
32521 GCTGTTGGAT TTATGCCTAA CCTATCAGCT TATCCAAAAT CTCACGGTAA AACTGCCAAA 
32581 AGTAACATTG TCAGTCAAGT TTACTTAAAC GGAGACAAAA CTAAACCTGT AACACTAACC 
32641 ATTACACTAA ACGGTACACA GGAAACAGGA GACACAACTC CAAGTGCATA CTCTATGTCA 
32701 TTTTCATGGG ACTGGTCTGG CCACAACTAC ATTAATGAAA TATTTGCCAC ATCCTCTTAC 
32761 ACTTTTTCAT ACATTGCCCA AGAATAAAGA ATCGTTTGTG TTATGTTTCA ACGTGTTTAT 
32821 TTTTCAATTG CAGAAAATTT CAAGTCATTT TTCATTCAGT AGTATAGCCC CACCACCACA 
32881 TAGCTTATAC AGATCACCGT ACCTTAATCA AACTCACAGA ACCCTAGTAT TCAACCTGCC 
32941 ACCTCCCTCC CAACACACAG AGTACACAGT CCTTTCTCCC CGGCTGGCCT TAAAAAGCAT 
33001 CATATCATGG GTAACAGACA TATTCTTAGG TGTTATATTC CACACGGTTT CCTGTCGAGC 
33061 CAAACGCTCA TCAGTGATAT TAATAAACTC CCCGGGCAGC TCACTTAAGT TCATGTCGCT 
33121 GTCCAGCTGC TGAGCCACAG GCTGCTGTCC AACTTGCGGT TGCTTAACGG GCGGCGAAGG 
33181 AGAAGTCCAC GCCTACATGG GGGTAGAGTC ATAATCGTGC ATCAGGATAG GGCGGTGGTG 
33241 CTGCAGCAGC GCGCGAATAA ACTGCTGCCG CCGCCGCTCC GTCCTGCAGG AATACAACAT 
33301 GGCAGTGGTC TCCTCAGCGA TGATTCGCAC CGCCCGCAGC ATAAGGCGCC TTGTCCTCCG 
33361 GGCACAGCAG CGCACCCTGA TCTCACTTAA ATCAGCACAG TAACTGCAGC ACAGCACCAC 
33421 AATATTGTTC AAAATCCCAC AGTGCAAGGC GCTGTATCCA AAGCTCATGG CGGGGACCAC 
33481 AGAACCCACG TGGCCATCAT ACCACAAGCG CAGGTAGATT AAGTGGCGAC CCCTCATAAA 
33541 CACGCTGGAC ATAAACATTA CCTCTTTTGG CATGTTGTAA TTCACCACCT CCCGGTACCA 
33601 TATAAACCTC TGATTAAACA TGGCGCCATC CACCACCATC CTAAACCAGC TGGCCAAAAC 
33661 CTGCCCGCCG GCTATACACT GCAGGGAACC GGGACTGGAA CAATGACAGT GGAGAGCCCA 
33721 GGACTCGTAA CCATGGATCA TCATGCTCGT CATGATATCA ATGTTGGCAC AACACAGGCA 
33781 CACGTGCATA CACTTCCTCA GGATTACAAG CTCCTCCCGC GTTAGAACCA TATCCCAGGG 
33841 AACAACCCAT TCCTGAATCA GCGTAAATCC CACACTGCAG GGAAGACCTC GCACGTAACT 
33901 CACGTTGTGC ATTGTCAAAG TGTTACATTC GGGCAGCAGC GGATGATCCT CCAGTATGGT 
33961 AGCGCGGGTT TCTGTCTCAA AAGGAGGTAG ACGATCCCTA CTGTACGGAG TGCGCCGAGA 
34021 CAACCGAGAT CGTGTTGGTC GTAGTGTCAT GCCAAATGGA ACGCCGGACG TAGTCATATT 
34081 TCCTGAAGCA AAACCAGGTG CGGGCGTGAC AAACAGATCT GCGTCTCCGG TCTCGCCGCT 
34141 TAGATCGCTC TGTGTAGTAG TTGTAGTATA TCCACTCTCT CAAAGCATCC AGGCGCCCCC 
34201 TGGCTTCGGG TTCTATGTAA ACTCCTTCAT GCGCCGCTGC CCTGATAACA TCCACCACCG 
34261 CAGAATAAGC CACACCCAGC CAACCTACAC ATTCGTTCTG CGAGTCACAC ACGGGAGGAG 
34321 CGGGAAGAGC TGGAAGAACC ATGTTTTTTT TTTTATTCCA AAAGATTATC CAAAACCTCA 
34381 AAATGAAGAT CTATTAAGTG AACGCGCTCC CCTCCGGTGG CGTGGTCAAA CTCTACAGCC 
34441 AAAGAACAGA TAATGGCATT TGTAAGATGT TGCACAATGG CTTCCAAAAG GCAAACGGCC 
34501 CTCACGTCCA AGTGGACGTA AAGGCTAAAC CCTTCAGGGT GAATCTCCTC TATAAACATT 
34561 CCAGCACCTT CAACCATGCC CAAATAATTC TCATCTCGCC ACCTTCTCAA TATATCTCTA 
34621 AGCAAATCCC GAATATTAAG TCCGGCCATT GTAAAAATCT GCTCCAGAGC GCCCTCCACC 
34681 TTCAGCCTCA AGCAGCGAAT CATGATTGCA AAAATTCAGG TTCCTCACAG ACCTGTATAA 
34741 GATTCAAAAG CGGAACATTA ACAAAAATAC CGCGATCCCG TAGGTCCCTT CGCAGGGCCA 
34801 GCTGAACATA ATCGTGCAGG TCTGCACGGA CCAGCGCGGC CACTTCCCCG CCAGGAACCT 
34861 TGACAAAAGA ACCCACACTG ATTATGACAC GCATACTCGG AGCTATGCTA ACCAGCGTAG 
34921 CCCCGATGTA AGCTTTGTTG CATGGGCGGC GATATAAAAT GCAAGGTGCT GCTCAAAAAA 
34981 TCAGGCAAAG CCTCGCGCAA AAAAGAAAGC ACATCGTAGT CATGCTCATG CAGATAAAGG 
35041 CAGGTAAGCT CCGGAACCAC CACAGAAAAA GACACCATTT TTCTCTCAAA CATGTCTGCG 
35101 GGTTTCTGCA TAAACACAAA ATAAAATAAC AAAAAAACAT TTAAACATTA GAAGCCTGTC 
35161 TTACAACAGG AAAAACAACC CTTATAAGCA TAAGACGGAC TACGGCCATG CCGGCGTGAC 
35221 CGTAAAAAAA CTGGTCACCG TGATTAAAAA GCACCACCGA CAGCTCCTCG GTCATGTCCG 
35281 GAGTCATAAT GTAAGACTCG GTAAACACAT CAGGTTGATT CATCGGTCAG TGCTAAAAAG 
35341 CGACCGAAAT AGCCCGGGGG AATACATACC CGCAGGCGTA GAGACAACAT TACAGCCCCC 
35401 ATAGGAGGTA TAACAAAATT AATAGGAGAG AAAAACACAT AAACACCTGA AAAACCCTCC 
35461 TGCCTAGGCA AAATAGCACC CTCCCGCTCC AGAACAACAT ACAGCGCTTC ACAGCGGCAG 
35521 CCTAACAGTC AGCCTTACCA GTAAAAAAGA AAACCTATTA AAAAAACACC ACTCGACACG 
35581 GCACCAGCTC AATCAGTCAC AGTGTAAAAA AGGGCCAAGT GCAGAGCGAG TATATATAGG 
35641 ACTAAAAAAT GACGTAACGG TTAAAGTCCA CAAAAAACAC CCAGAAAACC GCACGCGAAC 
35701 CTACGCCCAG AAACGAAAGC CAAAAAACCC ACAACTTCCT CAAATCGTCA CTTCCGTTTT 
35761 CCCACGTTAC GTAACTTCCC ATTTTAAGAA AACTACAATT CCCAACACAT ACAAGTTACT 
35821 CCGCCCTAAA ACCTACGTCA CCCGCCCCGT TCCCACGCCC CGCGCCACGT CACAAACTCC 
35881 ACCCCCTCAT TATCATATTG GCTTCAATCC AAAATAAGGT ATATTATTGA TGATG 
Western blot on whole-cell extracts from 293 cells transfected with plasmid DNA expressing the 
different HCV NS cassettes. Mature NS3 and NS5A products were detected with specific antibodies. 
IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 25 µg DNA/dose with 
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
IFNγ ELISPOT on splenocytes from BalbC mice immunized with two injections of 50 µg DNA/dose with 
GET of plasmid vectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
Western blot on whole-cell extracts from HeLa cells infected at different multiplicity of infection 
(m.o.i.; indicated at the top) with Adenovectors expressing the different HCV NS cassettes. Mature 
NS5B and NS5A products were detected with specific antibodies. 
IFNγ ELISPOT on splenocytes from C57black6 mice immunized with two injections of 10^9 vp/dose of 
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of 
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with one injection of 10^10 vp/dose of 
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of 
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
IFNγ ELISPOT on PBMC from Rhesus monkeys immunized with two injections of 10^11 vp/dose of 
Adenovectors expressing the different HCV NS cassettes. Data are expressed as SFC/10^6 PBMC. 
IFNγ ICS on PBMC from Rhesus monkeys immunized with two injections at a four-week interval with 10^10 vp/dose 
of Adenovectors expressing the different HCV NS cassettes. Data are expressed as number of positive 
IFNγ/CD3/CD8 per 10^6 lymphocytes. 
IFNγ ICS on PBMC from Rhesus monkeys immunized with two injections at a four-week interval with 10^11 vp/dose 
of Adenovectors expressing the different HCV NS cassettes. Data are expressed as number of positive 
IFNγ/CD3/CD8 per 10^6 lymphocytes. 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10 M vp/dose of Ad5-NS. 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10 n vp/dose of Ad5-NS. 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10 n vp/dose of MRKAd5-NSmut. 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10 M vp/dose of MRKAd5-NSmut. 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10»vp/dose of MRKAd6-NSmut. 
Bulk CTL assays on PBMC from Rhesus monkeys immunized with two injections of 10 M vp/dose of MRKAd6-NSmut. 
CGTGTGGAAG GACCTGCTGG AGGACACCGT GACCCCCATC GACACCACCA 
TCATGGCCAA GAACGAGGTG TTCTGCGTGC AGCCCGAGAA GGGCGGCCGC 
AAGCCCGCTC GCCTGATCGT GTTCCCCGAT CTGGGCGTGC GCGTGTGCGA 
GAAGATGGCC CTGTACGACG TGGTGAGCAC CCTGCCTCAG GTGGTGATGG 
GCTCAAGCTA CGGCTTCCAG TACAGCCCTG GCCAGCGCGT GGAGTTCCTG 
4801 GTGAACACCT GGAAGAGCAA GAAGAACCCC ATGGGCTTCA GCTACGACAC 

4851 ACGCTGCTTC GACAGCACCG TGACCGAGAA CGACATCCGC GTGGAGGAGA 

4901 GCATCTACCA GTGCTGCGAC CTGGCCCCTG AGGCCAGGCA GGCCATCAAG 

4951 AGCCTGACCG AGCGCCTGTA CATCGGAGGC CCTCTGACCA ACAGCAAGGG 

5001 ACAGAACTGC GGATACAGGC GCTGTAGGGC CTCTGGCGTG CTGACCACCA 

5051 GCTGTGGCAA CACCCTGACC TGCTACCTGA AGGCCAGCGC TGCCTGTCGC 

5101 GCTGCCAAGC TGCAGGACTG CACCATGCTG GTGAACGCCG CTGGCCTGGT 

5151 GGTGATTTGT GAAAGCGCTG GCACCCAGGA AGATGCTGCC AGCCTGCGCG 

5201 TGTTCACCGA GGCCATGACC AGGTACTCTG CCCCTCCCGG AGACCCCCCT 

5251 CAGCCCGAAT ACGACCTGGA GCTGATCACC AGCTGCTCAA GCAACGTGAG 

5301 CGTGGCTCAC GACGCCAGCG GAAAGCGCGT GTACTACCTG ACACGCGATC 

5351 CCACCACCCC TCTGGCTCGC GCTGCCTGGG AAACCGCTCG CCATACACCC 

5401 GTGAACAGCT GGCTGGGCAA CATCATCATG TACGCCCCTA CCCTGTGGGC 

5451 TCGCATGATC CTGATGACCC ACTTCTTCAG CATCCTGCTG GCTCAGGAGC 

5501 AGCTGGAGAA GGCCCTGGAC TGCCAGATTT ACGGCGCTTG CTACAGCATC 

5551 GAGCCCCTGG ACCTGCCCCA AATCATCGAG CGCCTGCACG GCCTGTCTGC 

5601 CTTCAGCCTG CACAGCTACA GCCCTGGCGA AATTAATCGC GTGGCCAGCT 

5651 GTCTGCGCAA ACTGGGCGTG CCTCCTCTGC GCGTGTGGAG GCATAGGGCT 

5701 AGGAGCGTGA GGGCTAGGCT GCTGAGCCAG GGAGGCAGGG CCGCTACCTG 

5751 TGGAAAGTAC CTGTTCAACT GGGCCGTGAA GACCAAGCTG AAGCTGACCC 

5801 CTATCCCTGC CGCTAGCCAG CTGGACCTGA GCGGATGGTT CGTGGCTGGC 

5851 TACAGCGGAG GCGACATCTA CCACAGCCTG TCTCGCGCTC GCCCTCGCTG 

5901 GTTCATGCTG TGCCTGCTGC TGCTGAGCGT GGGCGTGGGC ATCTACCTGC 

5951 TGCCCAACCG CTAAA 
SEQUENCE LISTING 

<110> Merck & Co. Inc., and Istituto Di Ricerche Di Biologia Molecolare P. 
Angeletti S.P.A. 

<120> HEPATITIS C VIRUS VACCINE 



<130> ITR0015Y 

<150> 60/363,774 
<151> 2002-03-13 

<150> 60/328,655 
<151> 2001-10-11 



<160> 17 

<170> FastSEQ for Windows Version 4.0 



<210> 1 
<211> 1985 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Met-NS3-NS4A-NS4B-NS5A-NS5B polypeptide 



<400> 1 
Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly 
1 5 10 15 

Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly 
20 25 30 

Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys 
35 40 45 

Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr 
50 55 60 

Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp 
65 70 75 80 

Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr 
85 90 95 

Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala 
100 105 110 

Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu 
115 120 125 

Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu 
130 135 140 

Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys 
145 150 155 160 

Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met 
165 170 175 

Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro 
180 185 190 

Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly 
195 200 205 

Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr 
210 215 220 

Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly 
225 230 235 240 

Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly 
245 250 255 

Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly 
260 265 270 

Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile 
275 280 285 

Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile 
290 295 300 

Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val 
305 310 315 320 

Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn 
325 330 335 

Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly 
340 345 350 

Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe 
355 360 365 

Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly 
370 375 380 

Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val 
385 390 395 400 

Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met 
405 410 415 

Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys 
420 425 430 

Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu 
435 440 445 

Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly 
450 455 460 

Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly 
465 470 475 480 

Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr 
485 490 495 

Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val 
500 505 510 

Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp 
515 520 525 

His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp 
530 535 540 

Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr 
545 550 555 560 

Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro 
565 570 575 

Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr 
580 585 590 

Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn 
595 600 605 

Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met 
610 615 620 

Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly 
625 630 635 640 
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val 
645 650 655 

Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp 
660 665 670 

Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser 
675 680 685 

His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys 
690 695 700 

Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala 
705 710 715 720 

Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp 
725 730 735 

Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly 
740 745 750 

Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe 
755 760 765 

Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe 
770 775 780 

Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala 
785 790 795 800 

Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser 
805 810 815 

Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala 
820 825 830 

Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met 
835 840 845 

Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro 
850 855 860 

Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His 
865 870 875 880 

Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala 
885 890 895 

Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu 
900 905 910 

Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile 
915 920 925 

Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser 
930 935 940 

Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys 
945 950 955 960 

Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro 
965 970 975 

Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly 
980 985 990 

Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala 
995 1000 1005 

Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro 
1010 1015 1020 

Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr 
1025 1030 1035 1040 

Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala 
1045 1050 1055 

Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly 
1060 1065 1070 
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro 
1075 1080 1085 

Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg 
1090 1095 1100 

Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val 
1105 1110 1115 1120 

Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro 
1125 1130 1135 

Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp 
1140 1145 1150 

Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly 
1155 1160 1165 

Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro 
1170 1175 1180 

Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp 
1185 1190 1195 1200 

Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile 
1205 1210 1215 

Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp 
1220 1225 1230 

Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu 
1235 1240 1245 

Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala 
1250 1255 1260 

Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp 
1265 1270 1275 1280 

Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala 
1285 1290 1295 

Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu 
1300 1305 1310 

Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly 
1315 1320 1325 

Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro 
1330 1335 1340 

Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr 
1345 1350 1355 1360 

Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser 
1365 1370 1375 

Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val 
1380 1385 1390 

Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys 
1395 1400 1405 

Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu 
1410 1415 1420 

Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly 
1425 1430 1435 1440 

Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp 
1445 1450 1455 

His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val 
1460 1465 1470 

Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro 
1475 1480 1485 

His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn 
1490 1495 1500 
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu 
1505 1510 1515 1520 

Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn 
1525 1530 1535 

Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg 
1540 1545 1550 

Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala 
1555 1560 1565 

Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser 
1570 1575 1580 

Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn 
1585 1590 1595 1600 

Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg 
1605 1610 1615 

Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser 
1620 1625 1630 

Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys 
1635 1640 1645 

Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys 
1650 1655 1660 

Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr 
1665 1670 1675 1680 

Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala 
1685 1690 1695 

Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Ala Ala 
1700 1705 1710 

Gly Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala 
1715 1720 1725 

Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro 
1730 1735 1740 

Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys 
1745 1750 1755 1760 

Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr 
1765 1770 1775 

Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu 
1780 1785 1790 

Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met 
1795 1800 1805 

Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe 
1810 1815 1820 

Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln 
1825 1830 1835 1840 

Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile 
1845 1850 1855 

Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser 
1860 1865 1870 

Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val 
1875 1880 1885 

Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg 
1890 1895 1900 

Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe 
1905 1910 1915 1920 

Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala 
1925 1930 1935 
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly 
1940 1945 1950 

Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu 
1955 1960 1965 

Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn 
1970 1975 1980 

Arg 
1985 

<210> 2 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Non-optimized cDNA sequence encoding SEQ. ID. NO. 1 

<400> 2 

gccaccatgg cgcccatcac ggcctactcc caacagacgc ggggcctact tggttgcatc 60 

atcactagcc ttacaggccg ggacaagaac caggtcgagg gagaggttca ggtggtttcc 120 

accgcaacac aatccttcct ggcgacctgc gtcaacggcg tgtgttggac cgtttaccat 180 

ggtgctggct caaagacctt agccggccca aaggggccaa tcacccagat gtacactaat 240 

gtggaccagg acctcgtcgg ctggcaggcg ccccccgggg cgcgttcctt gacaccatgc 300 

acctgtggca gctcagacct ttacttggtc acgagacatg ctgacgtcat tccggtgcgc 360 

cggcggggcg acagtagggg gagcctgctc tcccccaggc ctgtctccta cttgaagggc 420 

tcttcgggtg gtccactgct ctgcccttcg gggcacgctg tgggcatctt ccgggctgcc 480 

gtatgcaccc ggggggttgc gaaggcggtg gactttgtgc ccgtagagtc catggaaact 540 

actatgcggt ctccggtctt cacggacaac tcatcccccc cggccgtacc gcagtcattt 600 

caagtggccc acctacacgc tcccactggc agcggcaaga gtactaaagt gccggctgca 660 

tatgcagccc aagggtacaa ggtgctcgtc ctcaatccgt ccgttgccgc taccttaggg 720 

tttggggcgt atatgtctaa ggcacacggt attgacccca acatcagaac tggggtaagg 780 

accattacca caggcgcccc cgtcacatac tctacctatg gcaagtttct tgccgatggt 840 

ggttgctctg ggggcgctta tgacatcata atatgtgatg agtgccattc aactgactcg 900 

actacaatct tgggcatcgg cacagtcctg gaccaagcgg agacggctgg agcgcggctt 960 

gtcgtgctcg ccaccgctac gcctccggga tcggtcaccg tgccacaccc aaacatcgag 1020 

gaggtggccc tgtctaatac tggagagatc cccttctatg gcaaagccat ccccattgaa 1080 

gccatcaggg ggggaaggca tctcattttc tgtcattcca agaagaagtg cgacgagctc 1140 

gccgcaaagc tgtcaggcct cggaatcaac gctgtggcgt attaccgggg gctcgatgtg 1200 

tccgtcatac caactatcgg agacgtcgtt gtcgtggcaa cagacgctct gatgacgggc 1260 

tatacgggcg actttgactc agtgatcgac tgtaacacat gtgtcaccca gacagtcgac 1320 

ttcagcttgg atcccacctt caccattgag acgacgaccg tgcctcaaga cgcagtgtcg 1380 

cgctcgcagc ggcggggtag gactggcagg ggtaggagag gcatctacag gtttgtgact 1440 

ccgggagaac ggccctcggg catgttcgat tcctcggtcc tgtgtgagtg ctatgacgcg 1500 

ggctgtgctt ggtacgagct cacccccgcc gagacctcgg ttaggttgcg ggcctacctg 1560 

aacacaccag ggttgcccgt ttgccaggac cacctggagt tctgggagag tgtcttcaca 1620 

ggcctcaccc acatagatgc acacttcttg tcccagacca agcaggcagg agacaacttc 1680 

ccctacctgg tagcatacca agccacggtg tgcgccaggg ctcaggcccc acctccatca 1740 

tgggatcaaa tgtggaagtg tctcatacgg ctgaaaccta cgctgcacgg gccaacaccc 1800 

ttgctgtaca ggctgggagc cgtccaaaat gaggtcaccc tcacccaccc cataaccaaa 1860 

tacatcatgg catgcatgtc ggctgacctg gaggtcgtca ctagcacctg ggtgctggtg 1920 

ggcggagtcc ttgcagctct ggccgcgtat tgcctgacaa caggcagtgt ggtcattgtg 1980 

ggtaggatta tcttgtccgg gaggccggct attgttcccg acagggagtt tctctaccag 2040 

gagttcgatg aaatggaaga gtgcgcctcg cacctccctt acatcgagca gggaatgcag 2100 

ctcgccgagc aattcaagca gaaagcgctc gggttactgc aaacagccac caaacaagcg 2160 
gaggctgctg ctcccgtggt ggagtccaag tggcgagccc ttgagacatt ctgggcgaag 2220 

cacatgtgga atttcatcag cgggatacag tacttagcag gcttatccac tctgcctggg 2280 

aaccccgcaa tagcatcatt gatggcattc acagcctcta tcaccagccc gctcaccacc 2340 

caaagtaccc tcctgtttaa catcttgggg gggtgggtgg ctgcccaact cgcccccccc 2400 

agcgccgctt cggctttcgt gggcgccggc atcgccggtg cggctgttgg cagcataggc 2460 

cttgggaagg tgcttgtgga cattctggcg ggttatggag caggagtggc cggcgcgctc 2520 

gtggccttca aggtcatgag cggcgagatg ccctccaccg aggacctggt caatctactt 2580 

cctgccatcc tctctcctgg cgccctggtc gtcggggtcg tgtgtgcagc aatactgcgt 2640 

cgacacgtgg gtccgggaga gggggctgtg cagtggatga accggctgat agcgttcgcc 2700 

tcgcggggta atcatgtttc ccccacgcac tatgtgcctg agagcgacgc cgcagcgcgt 2760 

gttactcaga tcctctccag ccttaccatc actcagctgc tgaaaaggct ccaccagtgg 2820 

attaatgaag actgctccac accgtgttcc ggctcgtggc taagggatgt ttgggactgg 2880 

atatgcacgg tgttgactga cttcaagacc tggctccagt ccaagctcct gccgcagcta 2940 

ccgggagtcc cttttttctc gtgccaacgc gggtacaagg gagtctggcg gggagacggc 3000 

atcatgcaaa ccacctgccc atgtggagca cagatcaccg gacatgtcaa aaacggttcc 3060 

atgaggatcg tcgggcctaa gacctgcagc aacacgtggc atggaacatt ccccatcaac 3120 

gcatacacca cgggcccctg cacaccctct ccagcgccaa actattctag ggcgctgtgg 3180 

cgggtggccg ctgaggagta cgtggaggtc acgcgggtgg gggatttcca ctacgtgacg 3240 

ggcatgacca ctgacaacgt aaagtgccca tgccaggttc cggctcctga attcttcacg 3300 

gaggtggacg gagtgcggtt gcacaggtac gctccggcgt gcaggcctct cctacgggag 3360 

gaggttacat tccaggtcgg gctcaaccaa tacctggttg ggtcacagct accatgcgag 3420 

cccgaaccgg atgtagcagt gctcacttcc atgctcaccg acccctccca catcacagca 3480 

gaaacggcta agcgtaggtt ggccaggggg tctcccccct ccttggccag ctcttcagct 3540 

agccagttgt ctgcgccttc cttgaaggcg acatgcacta cccaccatgt ctctccggac 3600 

gctgacctca tcgaggccaa cctcctgtgg cggcaggaga tgggcgggaa catcacccgc 3660 

gtggagtcgg agaacaaggt ggtagtcctg gactctttcg acccgcttcg agcggaggag 3720 

gatgagaggg aagtatccgt tccggcggag atcctgcgga aatccaagaa gttccccgca 3780 

gcgatgccca tctgggcgcg cccggattac aaccctccac tgttagagtc ctggaaggac 3840 

ccggactacg tccctccggt ggtgcacggg tgcccgttgc cacctatcaa ggcccctcca 3900 

ataccacctc cacggagaaa gaggacggtt gtcctaacag agtcctccgt gtcttctgcc 3960 

ttagcggagc tcgctactaa gaccttcggc agctccgaat catcggccgt cgacagcggc 4020 

acggcgaccg cccttcctga ccaggcctcc gacgacggtg acaaaggatc cgacgttgag 4080 

tcgtactcct ccatgccccc ccttgagggg gaaccggggg accccgatct cagtgacggg 4140 

tcttggtcta ccgtgagcga ggaagctagt gaggatgtcg tctgctgctc aatgtcctac 4200 

acatggacag gcgccttgat cacgccatgc gctgcggagg aaagcaagct gcccatcaac 4260 

gcgttgagca actctttgct gcgccaccat aacatggttt atgccacaac atctcgcagc 4320 

gcaggcctgc ggcagaagaa ggtcaccttt gacagactgc aagtcctgga cgaccactac 4380 

cgggacgtgc tcaaggagat gaaggcgaag gcgtccacag ttaaggctaa actcctatcc 4440 

gtagaggaag cctgcaagct gacgccccca cattcggcca aatccaagtt tggctatggg 4500 

gcaaaggacg tccggaacct atccagcaag gccgttaacc acatccactc cgtgtggaag 4560 

gacttgctgg aagacactgt gacaccaatt gacaccacca tcatggcaaa aaatgaggtt 4620 

ttctgtgtcc aaccagagaa aggaggccgt aagccagccc gccttatcgt attcccagat 4680 

ctgggagtcc gtgtatgcga gaagatggcc ctctatgatg tggtctccac ccttcctcag 4740 

gtcgtgatgg gctcctcata cggattccag tactctcctg ggcagcgagt cgagttcctg 4800 

gtgaatacct ggaaatcaaa gaaaaacccc atgggctttt catatgacac tcgctgtttc 4860 

gactcaacgg tcaccgagaa cgacatccgt gttgaggagt caatttacca atgttgtgac 4920 

ttggcccccg aagccagaca ggccataaaa tcgctcacag agcggcttta tatcgggggt 4980 

cctctgacta attcaaaagg gcagaactgc ggttatcgcc ggtgccgcgc gagcggcgtg 5040 

ctgacgacta gctgcggtaa caccctcaca tgttacttga aggcctctgc agcctgtcga 5100 

gctgcgaagc tccaggactg cacgatgctc gtgaacgccg ccggccttgt cgttatctgt 5160 

gaaagcgcgg gaacccaaga ggacgcggcg agcctacgag tcttcacgga ggctatgact 5220 

aggtactctg ccccccccgg ggacccgccc caaccagaat acgacttgga gctgataaca 5280 

tcatgttcct ccaatgtgtc ggtcgcccac gatgcatcag gcaaaagggt gtactacctc 5340 

acccgtgatc ccaccacccc cctcgcacgg gctgcgtggg aaacagctag acacactcca 5400 

gttaactcct ggctaggcaa cattatcatg tatgcgccca ctttgtgggc aaggatgatt 5460 
ctgatgactc acttcttctc catccttcta gcacaggagc aacttgaaaa agccctggac 5520 

tgccagatct acggggcctg ttactccatt gagccacttg acctacctca gatcattgaa 5580 

cgactccatg gccttagcgc attttcactc catagttact ctccaggtga gatcaatagg 5640 

gtggcttcat gcctcaggaa acttggggta ccacccttgc gagtctggag acatcgggcc 5700 

aggagcgtcc gcgctaggct actgtcccag ggggggaggg ccgccacttg tggcaagtac 5760 

ctcttcaact gggcagtgaa gaccaaactc aaactcactc caatcccggc tgcgtcccag 5820 

ctggacttgt ccggctggtt cgttgctggt tacagcgggg gagacatata tcacagcctg 5880 

tctcgtgccc gaccccgctg gttcatgctg tgcctactcc tactttctgt aggggtaggc 5940 

atctacctgc tccccaaccg ataaa 5965 

<210> 3 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Optimized cDNA encoding SEQ ID NO: 1 
<400> 3 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60 

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120 

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240 

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300 

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360 

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420 

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480 

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540 

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600 

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660 

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720 

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780 

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840 

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960 

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020 

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200 

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260 

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380 

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440 

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500 

ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560 

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620 

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680 

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 
gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220 

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300 

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 

gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgcc 3780 

gccatgccca tctgggcccg ccccgactac aacccccccc tgctggagag ctggaaggac 3840 

cccgactacg tgccccccgt ggtgcacggc tgccccctgc cccccatcaa ggcccccccc 3900 

atcccccccc cccgccgcaa gcgcaccgtg gtgctgaccg agagcagcgt gagcagcgcc 3960 

ctggccgagc tggccaccaa gaccttcggc agcagcgaga gcagcgccgt ggacagcggc 4020 

accgccaccg ccctgcccga ccaggccagc gacgacggcg acaagggcag cgacgtggag 4080 

agctacagca gcatgccccc cctggagggc gagcccggcg accccgacct gagcgacggc 4140 

agctggagca ccgtgagcga ggaggccagc gaggacgtgg tgtgctgcag catgagctac 4200 

acctggaccg gcgccctgat caccccctgc gccgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gcgccaccac aacatggtgt acgccaccac cagccgcagc 4320 

gccggcctgc gccagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgacgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 

gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 
ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520 

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580 

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820 

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 



<210> 4 
<211> 37090 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> MRKAd6-NSmut nucleic acid 



<400> 4 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60 

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120 

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180 

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240 

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300 

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 

cgggtcaaag ttggcgtttt attattatag gcggccgcga tccattgcat acgttgtatc 480 

catatcataa tatgtacatt tatattggct catgtccaac attaccgcca tgttgacatt 540 

gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata 600 

tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc 660 

cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc 720 

attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt 780 

atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt 840 

atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca 900 

tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg 960 

actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc 1020 

aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg 1080 

gtaggcgtgt acggtgggag gtctatataa gcagagctcg tttagtgaac cgtcagatcg 1140 

cctggagacg ccatccacgc tgttttgacc tccatagaag acaccgggac cgatccagcc 1200 

tccgcggccg ggaacggtgc attggaacgc ggattccccg tgccaagagt gagatctgcc 1260 

accatggcgc ccatcacggc ctactcccaa cagacgcggg gcctacttgg ttgcatcatc 1320 

actagcctta caggccggga caagaaccag gtcgagggag aggttcaggt ggtttccacc 1380 

gcaacacaat ccttcctggc gacctgcgtc aacggcgtgt gttggaccgt ttaccatggt 1440 

gctggctcaa agaccttagc cggcccaaag gggccaatca cccagatgta cactaatgtg 1500 

gaccaggacc tcgtcggctg gcaggcgccc cccggggcgc gttccttgac accatgcacc 1560 

tgtggcagct cagaccttta cttggtcacg agacatgctg acgtcattcc ggtgcgccgg 1620 

cggggcgaca gtagggggag cctgctctcc cccaggcctg tctcctactt gaagggctct 1680 

tcgggtggtc cactgctctg cccttcgggg cacgctgtgg gcatcttccg ggctgccgta 1740 

tgcacccggg gggttgcgaa ggcggtggac tttgtgcccg tagagtccat ggaaactact 1800 

atgcggtctc cggtcttcac ggacaactca tcccccccgg ccgtaccgca gtcatttcaa 1860 

gtggcccacc tacacgctcc cactggcagc ggcaagagta ctaaagtgcc ggctgcatat 1920 

gcagcccaag ggtacaaggt gctcgtcctc aatccgtccg ttgccgctac cttagggttt 1980 

ggggcgtata tgtctaaggc acacggtatt gaccccaaca tcagaactgg ggtaaggacc 2040 

attaccacag gcgcccccgt cacatactct acctatggca agtttcttgc cgatggtggt 2100 

tgctctgggg gcgcttatga catcataata tgtgatgagt gccattcaac tgactcgact 2160 
acaatcttgg gcatcggcac agtcctggac caagcggaga cggctggagc gcggcttgtc 2220 

gtgctcgcca ccgctacgcc tccgggatcg gtcaccgtgc cacacccaaa catcgaggag 2280 

gtggccctgt ctaatactgg agagatcccc ttctatggca aagccatccc cattgaagcc 2340 

atcagggggg gaaggcatct cattttctgt cattccaaga agaagtgcga cgagctcgcc 2400 

gcaaagctgt caggcctcgg aatcaacgct gtggcgtatt accgggggct cgatgtgtcc 2460 

gtcataccaa ctatcggaga cgtcgttgtc gtggcaacag acgctctgat gacgggctat 2520 

acgggcgact ttgactcagt gatcgactgt aacacatgtg tcacccagac agtcgacttc 2580 

agcttggatc ccaccttcac cattgagacg acgaccgtgc ctcaagacgc agtgtcgcgc 2640 

tcgcagcggc ggggtaggac tggcaggggt aggagaggca tctacaggtt tgtgactccg 2700 

ggagaacggc cctcgggcat gttcgattcc tcggtcctgt gtgagtgcta tgacgcgggc 2760 

tgtgcttggt acgagctcac ccccgccgag acctcggtta ggttgcgggc ctacctgaac 2820 

acaccagggt tgcccgtttg ccaggaccac ctggagttct gggagagtgt cttcacaggc 2880 

ctcacccaca tagatgcaca cttcttgtcc cagaccaagc aggcaggaga caacttcccc 2940 

tacctggtag cataccaagc cacggtgtgc gccagggctc aggccccacc tccatcatgg 3000 

gatcaaatgt ggaagtgtct catacggctg aaacctacgc tgcacgggcc aacacccttg 3060 

ctgtacaggc tgggagccgt ccaaaatgag gtcaccctca cccaccccat aaccaaatac 3120 

atcatggcat gcatgtcggc tgacctggag gtcgtcacta gcacctgggt gctggtgggc 3180 

ggagtccttg cagctctggc cgcgtattgc ctgacaacag gcagtgtggt cattgtgggt 3240 

aggattatct tgtccgggag gccggctatt gttcccgaca gggagtttct ctaccaggag 3300 

ttcgatgaaa tggaagagtg cgcctcgcac ctcccttaca tcgagcaggg aatgcagctc 3360 

gccgagcaat tcaagcagaa agcgctcggg ttactgcaaa cagccaccaa acaagcggag 3420 

gctgctgctc ccgtggtgga gtccaagtgg cgagcccttg agacattctg ggcgaagcac 3480 

atgtggaatt tcatcagcgg gatacagtac ttagcaggct tatccactct gcctgggaac 3540 

cccgcaatag catcattgat ggcattcaca gcctctatca ccagcccgct caccacccaa 3600 

agtaccctcc tgtttaacat cttggggggg tgggtggctg cccaactcgc cccccccagc 3660 

gccgcttcgg ctttcgtggg cgccggcatc gccggtgcgg ctgttggcag cataggcctt 3720 

gggaaggtgc ttgtggacat tctggcgggt tatggagcag gagtggccgg cgcgctcgtg 3780 

gccttcaagg tcatgagcgg cgagatgccc tccaccgagg acctggtcaa tctacttcct 3840 

gccatcctct ctcctggcgc cctggtcgtc ggggtcgtgt gtgcagcaat actgcgtcga 3900 

cacgtgggtc cgggagaggg ggctgtgcag tggatgaacc ggctgatagc gttcgcctcg 3960 

cggggtaatc atgtttcccc cacgcactat gtgcctgaga gcgacgccgc agcgcgtgtt 4020 

actcagatcc tctccagcct taccatcact cagctgctga aaaggctcca ccagtggatt 4080 

aatgaagact gctccacacc gtgttccggc tcgtggctaa gggatgtttg ggactggata 4140 

tgcacggtgt tgactgactt caagacctgg ctccagtcca agctcctgcc gcagctaccg 4200 

ggagtccctt ttttctcgtg ccaacgcggg tacaagggag tctggcgggg agacggcatc 4260 

atgcaaacca cctgcccatg tggagcacag atcaccggac atgtcaaaaa cggttccatg 4320 

aggatcgtcg ggcctaagac ctgcagcaac acgtggcatg gaacattccc catcaacgca 4380 

tacaccacgg gcccctgcac accctctcca gcgccaaact attctagggc gctgtggcgg 4440 

gtggccgctg aggagtacgt ggaggtcacg cgggtggggg atttccacta cgtgacgggc 4500 

atgaccactg acaacgtaaa gtgcccatgc caggttccgg ctcctgaatt cttcacggag 4560 

gtggacggag tgcggttgca caggtacgct ccggcgtgca ggcctctcct acgggaggag 4620 

gttacattcc aggtcgggct caaccaatac ctggttgggt cacagctacc atgcgagccc 4680 

gaaccggatg tagcagtgct cacttccatg ctcaccgacc cctcccacat cacagcagaa 4740 

acggctaagc gtaggttggc cagggggtct cccccctcct tggccagctc ttcagctagc 4800 

cagttgtctg cgccttcctt gaaggcgaca tgcactaccc accatgtctc tccggacgct 4860 

gacctcatcg aggccaacct cctgtggcgg caggagatgg gcgggaacat cacccgcgtg 4920 

gagtcggaga acaaggtggt agtcctggac tctttcgacc cgcttcgagc ggaggaggat 4980 

gagagggaag tatccgttcc ggcggagatc ctgcggaaat ccaagaagtt ccccgcagcg 5040 

atgcccatct gggcgcgccc ggattacaac cctccactgt tagagtcctg gaaggacccg 5100 

gactacgtcc ctccggtggt gcacgggtgc ccgttgccac ctatcaaggc ccctccaata 5160 

ccacctccac ggagaaagag gacggttgtc ctaacagagt cctccgtgtc ttctgcctta 5220 

gcggagctcg ctactaagac cttcggcagc tccgaatcat cggccgtcga cagcggcacg 5280 

gcgaccgccc ttcctgacca ggcctccgac gacggtgaca aaggatccga cgttgagtcg 5340 

tactcctcca tgccccccct tgagggggaa ccgggggacc ccgatctcag tgacgggtct 5400 

tggtctaccg tgagcgagga agctagtgag gatgtcgtct gctgctcaat gtcctacaca 5460 
tggacaggcg ccttgatcac gccatgcgct gcggaggaaa gcaagctgcc catcaacgcg 5520 

ttgagcaact ctttgctgcg ccaccataac atggtttatg ccacaacatc tcgcagcgca 5580 

ggcctgcggc agaagaaggt cacctttgac agactgcaag tcctggacga ccactaccgg 5640 

gacgtgctca aggagatgaa ggcgaaggcg tccacagtta aggctaaact cctatccgta 5700 

gaggaagcct gcaagctgac gcccccacat tcggccaaat ccaagtttgg ctatggggca 5760 

aaggacgtcc ggaacctatc cagcaaggcc gttaaccaca tccactccgt gtggaaggac 5820 

ttgctggaag acactgtgac accaattgac accaccatca tggcaaaaaa tgaggttttc 5880 

tgtgtccaac cagagaaagg aggccgtaag ccagcccgcc ttatcgtatt cccagatctg 5940 

ggagtccgtg tatgcgagaa gatggccctc tatgatgtgg tctccaccct tcctcaggtc 6000 

gtgatgggct cctcatacgg attccagtac tctcctgggc agcgagtcga gttcctggtg 6060 

aatacctgga aatcaaagaa aaaccccatg ggcttttcat atgacactcg ctgtttcgac 6120 

tcaacggtca ccgagaacga catccgtgtt gaggagtcaa tttaccaatg ttgtgacttg 6180 

gcccccgaag ccagacaggc cataaaatcg ctcacagagc ggctttatat cgggggtcct 6240 

ctgactaatt caaaagggca gaactgcggt tatcgccggt gccgcgcgag cggcgtgctg 6300 

acgactagct gcggtaacac cctcacatgt tacttgaagg cctctgcagc ctgtcgagct 6360 

gcgaagctcc aggactgcac gatgctcgtg aacgccgccg gccttgtcgt tatctgtgaa 6420 

agcgcgggaa cccaagagga cgcggcgagc ctacgagtct tcacggaggc tatgactagg 6480 

tactctgccc cccccgggga cccgccccaa ccagaatacg acttggagct gataacatca 6540 

tgttcctcca atgtgtcggt cgcccacgat gcatcaggca aaagggtgta ctacctcacc 6600 

cgtgatccca ccacccccct cgcacgggct gcgtgggaaa cagctagaca cactccagtt 6660 

aactcctggc taggcaacat tatcatgtat gcgcccactt tgtgggcaag gatgattctg 6720 

atgactcact tcttctccat ccttctagca caggagcaac ttgaaaaagc cctggactgc 6780 

cagatctacg gggcctgtta ctccattgag ccacttgacc tacctcagat cattgaacga 6840 

ctccatggcc ttagcgcatt ttcactccat agttactctc caggtgagat caatagggtg 6900 

gcttcatgcc tcaggaaact tggggtacca cccttgcgag tctggagaca tcgggccagg 6960 

agcgtccgcg ctaggctact gtcccagggg gggagggccg ccacttgtgg caagtacctc 7020 

ttcaactggg cagtgaagac caaactcaaa ctcactccaa tcccggctgc gtcccagctg 7080 

gacttgtccg gctggttcgt tgctggttac agcgggggag acatatatca cagcctgtct 7140 

cgtgcccgac cccgctggtt catgctgtgc ctactcctac tttctgtagg ggtaggcatc 7200 

tacctgctcc ccaaccggta aatctagagc tgtgccttct agttgccagc catctgttgt 7260 

ttgcccctcc cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta 7320 

ataaaatgag gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg 7380 

ggtggggcag gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc 7440 

ggtgggctct atggccgatc ggcgcgccgt actgaaatgt gtgggcgtgg cttaagggtg 7500 

ggaaagaata tataaggtgg gggtcttatg tagttttgta tctgttttgc agcagccgcc 7560 

gccgccatga gcaccaactc gtttgatgga agcattgtga gctcatattt gacaacgcgc 7620 

atgcccccat gggccggggt gcgtcagaat gtgatgggct ccagcattga tggtcgcccc 7680 

gtcctgcccg caaactctac taccttgacc tacgagaccg tgtctggaac gccgttggag 7740 

actgcagcct ccgccgccgc ttcagccgct gcagccaccg cccgcgggat tgtgactgac 7800 

tttgctttcc tgagcccgct tgcaagcagt gcagcttccc gttcatccgc ccgcgatgac 7860 

aagttgacgg ctcttttggc acaattggat tctttgaccc gggaacttaa tgtcgtttct 7920 

cagcagctgt tggatctgcg ccagcaggtt tctgccctga aggcttcctc ccctcccaat 7980 

gcggtttaaa acataaataa aaaaccagac tctgtttgga tttggatcaa gcaagtgtct 8040 

tgctgtcttt atttaggggt tttgcgcgcg cggtaggccc gggaccagcg gtctcggtcg 8100 

ttgagggtcc tgtgtatttt ttccaggacg tggtaaaggt gactctggat gttcagatac 8160 

atgggcataa gcccgtctct ggggtggagg tagcaccact gcagagcttc atgctgcggg 8220 

gtggtgttgt agatgatcca gtcgtagcag gagcgctggg cgtggtgcct aaaaatgtct 8280 

ttcagtagca agctgattgc caggggcagg cccttggtgt aagtgtttac aaagcggtta 8340 

agctgggatg ggtgcatacg tggggatatg agatgcatct tggactgtat ttttaggttg 8400 

gctatgttcc cagccatatc cctccgggga ttcatgttgt gcagaaccac cagcacagtg 8460 

tatccggtgc acttgggaaa tttgtcatgt agcttagaag gaaatgcgtg gaagaacttg 8520 

gagacgccct tgtgacctcc aagattttcc atgcattcgt ccataatgat ggcaatgggc 8580 

ccacgggcgg cggcctgggc gaagatattt ctgggatcac taacgtcata gttgtgttcc 8640 

aggatgagat cgtcataggc catttttaca aagcgcgggc ggagggtgcc agactgcggt 8700 

ataatggttc catccggccc aggggcgtag ttaccctcac agatttgcat ttcccacgct 8760 



ttgagttcag atggggggat catgtctacc tgcggggcga tgaagaaaac ggtttccggg 8820 

gtaggggaga tcagctggga agaaagcagg ttcctgagca gctgcgactt accgcagccg 8880 

gtgggcccgt aaatcacacc tattaccggc tgcaactggt agttaagaga gctgcagctg 8940 

ccgtcatccc tgagcagggg ggccacttcg ttaagcatgt ccctgactcg catgttttcc 9000 

ctgaccaaat ccgccagaag gcgctcgccg cccagcgata gcagttcttg caaggaagca 9060 

aagtttttca acggtttgag accgtccgcc gtaggcatgc ttttgagcgt ttgaccaagc 9120 

agttccaggc ggtcccacag ctcggtcacc tgctctacgg catctcgatc cagcatatct 9180 

cctcgtttcg cgggttgggg cggctttcgc tgtacggcag tagtcggtgc tcgtccagac 9240 

gggccagggt catgtctttc cacgggcgca gggtcctcgt cagcgtagtc tgggtcacgg 9300 

tgaaggggtg cgctccgggc tgcgcgctgg ccagggtgcg cttgaggctg gtcctgctgg 9360 

tgctgaagcg ctgccggtct tcgccctgcg cgtcggccag gtagcatttg accatggtgt 9420 

catagtccag cccctccgcg gcgtggccct tggcgcgcag cttgcccttg gaggaggcgc 9480 

cgcacgaggg gcagtgcaga cttttgaggg cgtagagctt gggcgcgaga aataccgatt 9540 

ccggggagta ggcatccgcg ccgcaggccc cgcagacggt ctcgcattcc acgagccagg 9600 

tgagctctgg ccgttcgggg tcaaaaacca ggtttccccc atgctttttg atgcgtttct 9660 

tacctctggt ttccatgagc cggtgtccac gctcggtgac gaaaaggctg tccgtgtccc 9720 

cgtatacaga cttgagaggc ctgtcctcga gcggtgttcc gcggtcctcc tcgtatagaa 9780 

actcggacca ctctgagacg aaggctcgcg tccaggccag cacgaaggag gctaagtggg 9840 

aggggtagcg gtcgttgtcc actagggggt ccactcgctc cagggtgtga agacacatgt 9900 

cgccctcttc ggcatcaagg aaggtgattg gtttataggt gtaggccacg tgaccgggtg 9960 

ttcctgaagg ggggctataa aagggggtgg gggcgcgttc gtcctcactc tcttccgcat 10020 

cgctgtctgc gagggccagc tgttggggtg agtactccct ctcaaaagcg ggcatgactt 10080 

ctgcgctaag attgtcagtt tccaaaaacg aggaggattt gatattcacc tggcccgcgg 10140 

tgatgccttt gagggtggcc gcgtccatct ggtcagaaaa gacaatcttt ttgttgtcaa 10200 

gcttggtggc aaacgacccg tagagggcgt tggacagcaa cttggcgatg gagcgcaggg 10260 

tttggttttt gtcgcgatcg gcgcgctcct tggccgcgat gtttagctgc acgtattcgc 10320 

gcgcaacgca ccgccattcg ggaaagacgg tggtgcgctc gtcgggcact aggtgcacgc 10380 

gccaaccgcg gttgtgcagg gtgacaaggt caacgctggt ggctacctct ccgcgtaggc 10440 

gctcgttggt ccagcagagg cggccgccct tgcgcgagca gaatggcggt agtgggtcta 10500 

gctgcgtctc gtccgggggg tctgcgtcca cggtaaagac cccgggcagc aggcgcgcgt 10560 

cgaagtagtc tatcttgcat ccttgcaagt ctagcgcctg ctgccatgcg cgggcggcaa 10620 

gcgcgcgctc gtatgggttg agtgggggac cccatggcat ggggtgggtg agcgcggagg 10680 

cgtacatgcc gcaaatgtcg taaacgtaga ggggctctct gagtattcca agatatgtag 10740 

ggtagcatct tccaccgcgg atgctggcgc gcacgtaatc gtatagttcg tgcgagggag 10800 

cgaggaggtc gggaccgagg ttgctacggg cgggctgctc tgctcggaag actatctgcc 10860 

tgaagatggc atgtgagttg gatgatatgg ttggacgctg gaagacgttg aagctggcgt 10920 

ctgtgagacc taccgcgtca cgcacgaagg aggcgtagga gtcgcgcagc ttgttgacca 10980 

gctcggcggt gacctgcacg tctagggcgc agtagtccag ggtttccttg atgatgtcat 11040 

acttatcctg tccctttttt ttccacagct cgcggttgag gacaaactct tcgcggtctt 11100 

tccagtactc ttggatcgga aacccgtcgg cctccgaacg gtaagagcct agcatgtaga 11160 

actggttgac ggcctggtag gcgcagcatc ccttttctac gggtagcgcg tatgcctgcg 11220 

cggccttccg gagcgaggtg tgggtgagcg caaaggtgtc cctaaccatg actttgaggt 11280 

actggtattt gaagtcagtg tcgtcgcatc cgccctgctc ccagagcaaa aagtccgtgc 11340 

gctttttgga acgcgggttt ggcagggcga aggtgacatc gttgaagagt atctttcccg 11400 

cgcgaggcat aaagttgcgt gtgatgcgga agggtcccgg cacctcggaa cggttgttaa 11460 

ttacctgggc ggcgagcacg atctcgtcaa agccgttgat gttgtggccc acaatgtaaa 11520 

gttccaagaa gcgcgggatg cccttgatgg aaggcaattt tttaagttcc tcgtaggtga 11580 

gctcttcagg ggagctgagc ccgtgctctg aaagggccca gtctgcaaga tgagggttgg 11640 

aagcgacgaa tgagctccac aggtcacggg ccattagcat ttgcaggtgg tcgcgaaagg 11700 

tcctaaactg gcgacctatg gccatttttt ctggggtgat gcagtagaag gtaagcgggt 11760 

cttgttccca gcggtcccat ccaaggtccg cggctaggtc tcgcgcggcg gtcactagag 11820 

gctcatctcc gccgaacttc atgaccagca tgaagggcac gagctgcttc ccaaaggccc 11880 

ccatccaagt ataggtctct acatcgtagg tgacaaagag acgctcggtg cgaggatgcg 11940 

agccgatcgg gaagaactgg atctcccgcc accagttgga ggagtggctg ttgatgtggt 12000 

gaaagtagaa gtccctgcga cgggccgaac actcgtgctg gcttttgtaa aaacgtgcgc 12060 
agtactggca 
caaggaagca 
cttcggctgc 
ccacgccgcg 
catcgcgcag 
gctcctgcag 
tgatttccag 
gcgcgactac 
ctaaaagcgg 
agggggcagg 
gctggcgaac 
gacgggcccg 
gacggcggcc 
ggccatgaac 
ggcggcgagg 
gttccagacg 
cgcgagattg 
gtagttgagg 
cgtggattcg 
ggcgaagttg 
gatgagctcg 
ttcttcaatc 
aggggggaca 
ctccccgcgg 
ttggaagacg 
ggatacggcg 
cctgagcgag 
acagtcgcaa 
tctggcggag 
cgacagaagc 
ccaggcttcg 
cggcacttct 
ggcggagttt 
catcggctga 
ctgcgtgagg 
gatggtgtaa 
gagctcggtg 
ccgcaccagg 
gcgtagggtg 
gatgtacctg 
gacgcggttc 
ggtcaggcgc 
cactcttccg 
gaaccccgga 
ggtgtgcgac 
tgctgcgcta 
gaaagcatta 
gggacccccg 
gtcatgcaag 
ttcccagatg 
agagcagcgg 
atccgcggct 
ctactcggac 
acacccaagg 
gtttcgcgac 



gcggtgcacg 
gagtgggaat 
ttgtccttga 
cgagcccaaa 
atgggagctg 
gtttacctcg 
gggctggttg 
ggtaccgcgc 
tgacgcgggc 
ggcacgtcgg 
gcgacgacgc 
gtgagcttga 
tggcgcaaaa 
tgctcgatct 
tcgttggaga 
cggctgtaga 
agctccacgt 
gtggtggcgg 
ttgatatccc 
aaaaactggg 
gcgacagtgt 
tcctcttcca 
cggcggcgac 
cgacggcgca 
ccgcccgtca 
ctaacgatgc 
tccgcatcga 
ggtaggctga 
gtgctgctga 
accatgtcct 
ttttgacatc 
tcttctcctt 
ggccgtaggt 
agcagggcca 
gtagactgga 
gtgcagttgg 
tacctgagac 
tactggtatc 
gccggggctc 
gacatccagg 
cagatgttgc 
gcgcagtcgt 
tggtctggtg 
tccggccgtc 
gtcagacaac 
gcttttttgg 
agtggctcgc 
gttcgagtct 
accccgcttg 
catccggtgc 
cagacatgca 
gacgcggcgg 
ctggaggagg 
gtgcagctga 
cgcgagggag 



ggctgtacat 
ttgagcccct 
ccgtctggct 
gtccagatgt 
tccatggtct 
catagccggg 
gtggcggcgt 
ggcgggcggt 
gggcccccgg 
cgccgcgcgc 
ggcggttgat 
acctgaaaga 
tctcctgcac 
cttcctcctg 
tgcgggccat 
ccacgccccc 
gccgggcgaa 
tgtgttctgc 
ccaaggcctc 
agttgcgcgc 
cgcgcacctc 
taagggcctc 
gacggcgcac 
tggtctcggt 
tgtcccggtt 
atctcaacaa 
ccggatcgga 
gcaccgtggc 
tgatgtaatt 
tgggtccggc 
ggcgcaggtc 
cctcttgtcc 
ggcgccctct 
ggtcggcgac 
agtcgtccat 
ccataacgga 
gcgagtaagc 
ccaccaaaaa 
cgggggcgag 
tgatgccggc 
gcagcggcaa 
tgacgctcta 
gataaattcg 
cgccgtgatc 
gggggagcgc 
ccactggccg 
tccctgtagc 
cgggccggcc 
caaattcctc 
tgcggcagat 
gggcaccctc 
cagatggtga 
gcgagggcct 
agcgtgacac 
aggagcccga 



cctgcacgag 
cgcctggcgg 
gctcgagggg 
ccgcgcgcgg 
ggagctcccg 
tcagggcgcg 
cgatggcttg 
gggccgcggg 
aggtaggggg 
gggcaggagc 
ctcctgaatc 
gagttcgaca 
gtctcctgag 
gagatctccg 
gagctgcgag 
ttcggcatcg 
gacggcgtag 
cacgaagaag 
aaggcgctcc 
cgacacggtt 
gcgctcaaag 
cccttcttct 
cgggaggcgg 
gacggcgcgg 
atgggttggc 
ttgttgtgta 
aaacctctcg 
gggcggcagc 
aaagtaggcg 
ctgctgaatg 
tttgtagtag 
tgcatctctt 
tcctcccatg 
aacgcgctcg 
gtccacaaag 
ccagttaacg 
ccttgagtca 
gtgcggcggc 
gtcttccaac 
ggcggtggtg 
aaagtgctcc 
gaccgtgcaa 
caagggtatc 
catgcggtta 
tccttttggc 
cgcgcggcgt 
cggagggtta 
ggactgcggc 
cggaaacagg 
gcgcccccct 
cccttctcct 
ttacgaaccc 
ggcgcggcta 
gcgcgaggcg 
ggagatgcgg 



gttgacctga 
gtttggctgg 
agttacggtg 
cggtcggagc 
cggcgtcagg 
ggctaggtcc 
caagaggccg 
ggtgtccttg 
ggctcgggac 
tggtgctgcg 
tggcgcctct 
gaatcaattt 
ttgtcttgat 
cgtccggctc 
aaggcgttga 
cgggcgcgca 
tttcgcaggc 
tacataaccc 
atggcctcgt 
aactcctcct 
gctacagggg 
tcttctggcg 
tcgacaaagc 
ccgttctcgc 

ggggggctgc 

ggtactccgc 
agaaaggcgt 
gggcggcggt 
gtcttgagac 
cgcaggcggt 
tcttgcatga 
gcatctatcg 
cgtgtgaccc 
gctaatatgg 
cggtggtatg 
gtctggtgac 
aagacgtagt 
ggctggcggt 
ataaggcgat 
gaggcgcgcg 
atggtcggga 
aaggagagcc 
atggcggacg 
ccgcccgcgt 
ttccttccag 
aagcggttag 
ttttccaagg 
gaacgggggt 
gacgagcccc 
cctcagcagc 
accgcgtcag 
ccgcggcgcc 
ggagcgccct 
tacgtgccgc 
gatcgaaagt 



cgaccgcgca 
tggtcttcta 
gatcggacca 
ttgatgacaa 
tcaggcggga 
aggtgatacc 
catccccgcg 
gatgatgcat 
ccgccgggag 
cgcggaggtt 
gcgtgaagac 
cggtgtcgtt 
aggcgatctc 
gctccacggt 
ggcctccctc 
tgaccacctg 
gctgaaagag 
agcgccgcaa 
agaagtccac 
ccagaagacg 
cctcttcttc 
gcggtggggg 
gctcgatcat 
gggggcgcag 
cgtgcggcag 
caccgaggga 
ctaaccagtc 
cggggttgtt 
ggcggatggt 
cggccatgcc 
gcctttctac 
ctgcggcggc 
cgaagcccct 
cctgctgcac 
cgcccgtgtt 
ccggctgcga 
cgttgcaagt 
agaggggcca 
gatatccgta 
gaaagtcacg 
cgctctggcc 
tgtaagcggg 
accggggttc 
gtcgaaccca 
gcgcggcgga 
gctggaaagc 
gttgagtcgc 
ttgcctcccc 
ttttttgctt 
ggcaagagca 
gaggggcaac 
ggacccggca 
ctcctgagcg 
ggcagaacct 
tccatgcagg 



12120 
12180 
12240 
12300 
12360 
12420 
12480 
12540 
12600 
12660 
12720 
12780 
12840 
12900 
12960 
13020 
13080 
13140 
13200 
13260 
13320 
13380 
13440 
13500 
13560 
13620 
13680 
13740 
13800 
13860 
13920 
13980 
14040 
14100 
14160 
14220 
14280 
14340 
14400 
14460 
14520 
14580 
14640 
14700 
14760 
14820 
14880 
14940 
15000 
15060 
15120 
15180 
15240 
15300 
15360 
gcgcgagttg cggcatggcc tgaaccgcga gcggttgctg cgcgaggagg actttgagcc 15420
cgacgcgcgg accgggatta gtcccgcgcg cgcacacgtg gcggccgccg acctggtaac 15480
cgcgtacgag cagacggtga accaggagat taactttcaa aaaagcttta acaaccacgt 15540
gcgcacgctt gtggcgcgcg aggaggtggc tataggactg atgcatctgt gggactttgt 15600
aagcgcgctg gagcaaaacc caaatagcaa gccgctcatg gcgcagctgt tccttatagt 15660
gcagcacagc agggacaacg aggcattcag ggatgcgctg ctaaacatag tagagcccga 15720
gggccgctgg ctgctcgatt tgataaacat tctgcagagc atagtggtgc aggagcgcag 15780
cttgagcctg gctgacaagg tggccgccat taactattcc atgctcagtc tgggcaagtt 15840
ttacgcccgc aagatatacc atacccctta cgttcccata gacaaggagg taaagatcga 15900
ggggttctac atgcgcatgg cgctgaaggt gcttaccttg agcgacgacc tgggcgttta 15960
tcgcaacgag cgcatccaca aggccgtgag cgtgagccgg cggcgcgagc tcagcgaccg 16020
cgagctgatg cacagcctgc aaagggccct ggctggcacg ggcagcggcg atagagaggc 16080
cgagtcctac tttgacgcgg gcgctgacct gcgctgggcc ccaagccgac gcgccctgga 16140
ggcagctggg gccggacctg ggctggcggt ggcacccgcg cgcgctggca acgtcggcgg 16200
cgtggaggaa tatgacgagg acgatgagta cgagccagag gacggcgagt actaagcggt 16260
gatgtttctg atcagatgat gcaagacgca acggacccgg cggtgcgggc ggcgctgcag 16320
agccagccgt ccggccttaa ctccacggac gactggcgcc aggtcatgga ccgcatcatg 16380
tcgctgactg cgcgcaaccc tgacgcgttc cggcagcagc cgcaggccaa ccggctctcc 16440
gcaattctgg aagcggtggt cccggcgcgc gcaaacccca cgcacgagaa ggtgctggcg 16500
atcgtaaacg cgctggccga aaacagggcc atccggcccg atgaggccgg cctggtctac 16560
gacgcgctgc ttcagcgcgt ggctcgttac aacagcagca acgtgcagac caacctggac 16620
cggctggtgg gggatgtgcg cgaggccgtg gcgcagcgtg agcgcgcgca gcagcagggc 16680
aacctgggct ccatggttgc actaaacgcc ttcctgagta cacagcccgc caacgtgccg 16740
cggggacagg aggactacac caactttgtg agcgcactgc ggctaatggt gactgagaca 16800
ccgcaaagtg aggtgtatca gtccgggcca gactattttt tccagaccag tagacaaggc 16860
ctgcagaccg taaacctgag ccaggctttc aagaacttgc aggggctgtg gggggtgcgg 16920
gctcccacag gcgaccgcgc gaccgtgtct agcttgctga cgcccaactc gcgcctgttg 16980
ctgctgctaa tagcgccctt cacggacagt ggcagcgtgt cccgggacac atacctaggt 17040
cacttgctga cactgtaccg cgaggccata ggtcaggcgc atgtggacga gcatactttc 17100
caggagatta caagtgttag ccgcgcgctg gggcaggagg acacgggcag cctggaggca 17160
accctgaact acctgctgac caaccggcgg caaaaaatcc cctcgttgca cagtttaaac 17220
agcgaggagg agcgcatttt gcgctatgtg cagcagagcg tgagccttaa cctgatgcgc 17280
gacggggtaa cgcccagcgt ggcgctggac atgaccgcgc gcaacatgga accgggcatg 17340
tatgcctcaa accggccgtt tatcaatcgc ctaatggact acttgcatcg cgcggccgcc 17400
gtgaaccccg agtatttcac caatgccatc ttgaacccgc actggctacc gccccctggt 17460
ttctacaccg ggggattcga ggtgcccgag ggtaacgatg gattcctctg ggacgacata 17520
gacgacagcg tgttttcccc gcaaccgcag accctgctag agttgcaaca acgcgagcag 17580
gcagaggcgg cgctgcgaaa ggaaagcttc cgcaggccaa gcagcttgtc cgatctaggc 17640
gctgcggccc cgcggtcaga tgctagtagc ccatttccaa gcttgatagg gtctcttacc 17700
agcactcgca ccacccgccc gcgcctgctg ggcgaggagg agtacctaaa caactcgctg 17760
ctgcagccgc agcgcgaaaa gaacctgcct ccggcgtttc ccaacaacgg gatagagagc 17820
ctagtggaca agatgagtag atggaagacg tatgcgcagg agcacaggga tgtgcccggc 17880
ccgcgcccgc ccacccgtcg tcaaaggcac gaccgtcagc ggggtctggt gtgggaggac 17940
gatgactcgg cagacgacag cagcgtcttg gatttgggag ggagtggcaa cccgtttgca 18000
caccttcgcc ccaggctggg gagaatgttt taaaaaaaag catgatgcaa aataaaaaac 18060
tcaccaaggc catggcaccg agcgttggtt ttcttgtatt ccccttagta tgcggcgcgc 18120
ggcgatgtat gaggaaggtc ctcctccctc ctacgagagc gtggtgagcg cggcgccagt 18180
ggcggcggcg ctgggttcac ccttcgatgc tcccctggac ccgccgttcg tgcctccgcg 18240
gtacctgcgg cctaccgggg ggagaaacag catccgttac tctgagttgg cacccctatt 18300
cgacaccacc cgtgtgtacc ttgtggacaa caagtcaacg gatgtggcat ccctgaacta 18360
ccagaacgac cacagcaact ttctaaccac ggtcattcaa aacaatgact acagcccggg 18420
ggaggcaagc acacagacca tcaatcttga cgaccggtcg cactggggcg gcgacctgaa 18480
aaccatcctg cataccaaca tgccaaatgt gaacgagttc atgtttacca ataagtttaa 18540
ggcgcgggtg atggtgtcgc gctcgcttac taaggacaaa caggtggagc tgaaatacga 18600
gtgggtggag ttcacgctgc ccgagggcaa ctactccgag accatgacca tagaccttat 18660
gaacaacgcg atcgtggagc actacttgaa agtgggcagg cagaacgggg ttctggaaag 18720

cgacatcggg gtaaagtttg acacccgcaa cttcagactg gggtttgacc cagtcactgg 18780 

tcttgtcatg cctggggtat atacaaacga agccttccat ccagacatca ttttgctgcc 18840 

aggatgcggg gtggacttca cccacagccg cctgagcaac ttgttgggca tccgcaagcg 18900 

gcaacccttc caggagggct ttaggatcac ctacgatgac ctggagggtg gtaacattcc 18960 

cgcactgttg gatgtggacg cctaccaggc aagcttgaaa gatgacaccg aacagggcgg 19020 

gggtggcgca ggcggcggca acaacagtgg cagcggcgcg gaagagaact ccaacgcggc 19080 

agctgcggca atgcagccgg tggaggacat gaacgatcat gccattcgcg gcgacacctt 19140 

tgccacacgg gcggaggaga agcgcgctga ggccgaggca gcggccgaag ctgccgcccc 19200 

cgctgcggag gctgcacaac ccgaggtcga gaagcctcag aagaaaccgg tgattaaacc 19260 

cctgacagag gacagcaaga aacgcagtta caacctaata agcaatgaca gcaccttcac 19320 

ccagtaccgc agctggtacc ttgcatacaa ctacggcgac cctcaggccg ggatccgctc 19380 

atggaccctg ctttgcactc ctgacgtaac ctgcggctcg gagcaggtat actggtcgtt 19440 

gcccgacatg atgcaagacc ccgtgacctt ccgctccacg cgccagatca gcaactttcc 19500 

ggtggtgggc gccgagctgt tgcccgtgca ctccaagagc ttctacaacg accaggccgt 19560 

ctactcccag ctcatccgcc agtttacctc tctgacccac gtgttcaatc gctttcccga 19620 

gaaccagatt ttggcgcgcc cgccagcccc caccatcacc accgtcagtg aaaacgttcc 19680 

tgctctcaca gatcacggga cgctaccgct gcgcaacagc atcggaggag tccagcgagt 19740 

gaccattact gacgccagac gccgcacctg cccctacgtt tacaaggccc tgggcatagt 19800 

ctcgccgcgc gtcctatcga gccgcacttt ttgagcaagc atgtccatcc ttatatcgcc 19860 

cagcaataac acaggctggg gcctgcgctt cccaagcaag atgtttggcg gggccaagaa 19920 

gcgctccgac caacacccag tgcgcgtgcg cgggcactac cgcgcgccct ggggcgcgca 19980 

caaacgcggc cgcactgggc gcaccaccgt cgatgacgcc atcgacgcgg tggtggagga 20040 

ggcgcgcaac tacacgccca cgccgccgcc agtgtccacc gtggacgcgg ccattcagac 20100 

cgtggtgcgc ggagcccggc gctacgctaa aatgaagaga cggcggaggc gcgtagcacg 20160 

tcgccaccgc cgccgacccg gcactgccgc ccaacgcgcg gcggcggccc tgcttaaccg 20220 

cgcacgtcgc accggccgac gggcggccat gcgagccgct cgaaggctgg ccgcgggtat 20280 

tgtcactgtg ccccccaggt ccaggcgacg agcggccgcc gcagcagccg cggccattag 20340 

tgctatgact cagggtcgca ggggcaacgt gtactgggtg cgcgactcgg ttagcggcct 20400 

gcgcgtgccc gtgcgcaccc gccccccgcg caactagatt gcaataaaaa actacttaga 20460 

ctcgtactgt tgtatgtatc cagcggcggc ggcgcgcatc gaagctatgt ccaagcgcaa 20520 

aatcaaagaa gagatgctcc aggtcatcgc gccggagatc tatggccccc cgaagaagga 20580 

agagcaggat tacaagcccc gaaagctaaa gcgggtcaaa aagaaaaaga aagatgatga 20640 

tgatgatgaa cttgacgacg aggtggaact gttgcacgcg accgcgccca ggcgacgggt 20700 

acagtggaaa ggtcgacgcg taagacgtgt tttgcgaccc ggcaccaccg tagtctttac 20760 

gcccggtgag cgctccaccc gcacctacaa gcgcgtgtat gatgaggtgt acggcgacga 20820 

ggacctgctt gagcaggcca acgagcgcct cggggagttt gcctacggaa agcggcataa 20880 

ggacatgctg gcgttgccgc tggacgaggg caacccaaca cctagcctaa agcccgtgac 20940 

actgcagcag gtgctgcccg cgcttgcacc gtccgaagaa aagcgcggcc taaagcgcga 21000 

gtctggtgac ttggcaccca ccgtgcagct gatggtaccc aagcgtcagc gactggaaga 21060 

tgtcttggaa aaaatgaccg tggagcctgg gctggagccc gaggtccgcg tgcggccaat 21120 

caagcaggtg gcaccgggac tgggcgtgca gaccgtggac gttcagatac ccaccaccag 21180 

tagcactagt attgccactg ccacagaggg catggagaca caaacgtccc cggttgcctc 21240 

ggcggtggca gatgccgcgg tgcaggcggc cgctgcggcc gcgtccaaga cctctacgga 21300 

ggtgcaaacg gacccgtgga tgtttcgtgt ttcagccccc cggcgtccgc gccgttcaag 21360 

gaagtacggc gccgccagcg cgctactgcc cgaatatgcc ctacatcctt ccatcgcgcc 21420 

tacccccggc tatcgtggct acacctaccg ccccagaaga cgagcaacta cccgacgccg 21480 

aaccaccact ggaacccgcc gccgccgtcg ccgtcgccag cccgtgctgg ccccgatttc 21540 

cgtgcgcagg gtggctcgcg aaggaggcag gaccctggtg ctgccaacag cgcgctacca 21600 

ccccagcatc gtttaaaagc cggtctttgt ggttcttgca gatatggccc tcacctgccg 21660 

cctccgtttc ccggtgccgg gattccgagg aagaatgcac cgtaggaggg gcatggccgg 21720 

ccacggcccg acgggcggca tgcgtcgtgc gcaccaccgg cggcggcgcg cgtcgcaccg 21780 

tcgcatgcgc ggcggtatcc tgcccctcct tattccactg atcgccgcgg cgattggcgc 21840 

cgtgcccgga attgcatccg tggccttgca ggcgcagaga cactgattaa aaacaagtta 21900 

catgtggaaa aatcaaaata aaagtctgga ctctcacgct cgcttggtcc tgtaactatt 21960 
ttgtagaatg gaagacatca actttgcgtc actggccccg cgacacggct cgcgcccgtt 22020
catgggaaac tggcaagata tcggcaccag caatatgagc ggtggcgcct tcagctgggg 22080
ctcgctgtgg agcggcatta aaaatttcgg ttccgccgtt aagaactatg gcagcaaagc 22140
ctggaacagc agcacaggcc agatgctgag ggacaagttg aaagagcaaa atttccaaca 22200
aaaggtggta gatggcctgg cctctggcat tagcggggtg gtggacctgg ccaaccaggc 22260
agtgcaaaat aagattaaca gtaagcttga tccccgccct cccgtagagg agcctccacc 22320
ggccgtggag acagtgtctc cagaggggcg tggcgaaaag cgtccgcgac ccgacaggga 22380
agaaactctg gtgacgcaaa tagacgagcc tccctcgtac gaggaggcac taaagcaagg 22440
cctgcccacc acccgtccca tcgcgcccat ggctaccgga gtgctgggcc agcacacacc 22500
cgtaacgctg gacctgcctc cccccgccga cacccagcag aaacctgtgc tgccaggccc 22560
gtccgccgtt gttgtaaccc gtcctagccg cgcgtccctg cgccgcgccg ccagcggtcc 22620
gcgatcgttg cggcccgtag ccagtggcaa ctggcaaagc acactgaaca gcatcgtggg 22680
tttgggggtg caatccctga agcgccgacg atgcttctga tagctaacgt gtcgtatgtg 22740
tgtcatgtat gcgtccatgt cgccgccaga ggagctgctg agccgccgcg cgcccgcttt 22800
ccaagatggc taccccttcg atgatgccgc agtggtctta catgcacatc tcgggccagg 22860
acgcctcgga gtacctgagc cccgggctgg tgcagttcgc ccgcgccacc gagacgtact 22920
tcagcctgaa taacaagttt agaaacccca cggtggcgcc tacgcacgac gtgaccacag 22980
accggtctca gcgtttgacg ctgcggttca tccccgtgga ccgcgaggat actgcgtact 23040
cgtacaaggc gcggttcacc ctagctgtgg gtgataaccg tgtgctagac atggcttcca 23100
cgtactttga catccgcggc gtgctggaca ggggccctac ttttaagccc tactctggca 23160
ctgcctacaa cgcactggcc cccaagggtg cccccaactc gtgcgagtgg gaacaaaatg 23220
aaactgcaca agtggatgct caagaacttg acgaagagga gaatgaagcc aatgaagctc 23280
aggcgcgaga acaggaacaa gctaagaaaa cccatgtata tgcccaggct ccactgtccg 23340
gaataaaaat aactaaagaa ggtctacaaa taggaactgc cgacgccaca gtagcaggtg 23400
ccggcaaaga aattttcgca gacaaaactt ttcaacctga accacaagta ggagaatctc 23460
aatggaacga agcggatgcc acagcagctg gtggaagggt tcttaaaaag acaactccca 23520
tgaaaccctg ctatggctca tacgctagac ccaccaattc caacggcgga cagggcgtta 23580
tggttgaaca aaatggtaaa ttggaaagtc aagtcgaaat gcaatttttt tccacatcca 23640
caaatgccac aaatgaagtt aacaatatac aaccaacagt tgtattgtac agcgaagatg 23700
taaacatgga aactccagat actcatcttt cttataaacc taaaatgggg gataaaaatg 23760
ccaaagtcat gcttggacaa caagcaatgc caaacagacc aaattacatt gcttttagag 23820
acaattttat tggtctcatg tattacaaca gcacaggtaa catgggtgtc cttgctggtc 23880
aggcatcgca gttgaacgct gttgtagatt tgcaagacag aaacacagag ctgtcctacc 23940
agcttttgct tgattcaatt ggcgacagaa caagatactt ttcaatgtgg aatcaagctg 24000
ttgacagcta tgatccagat gtcagaatta ttgagaacca tggaactgag gatgagttgc 24060
caaattattg ctttcctctt ggtggaattg ggattactga cacttttcaa gctgttaaaa 24120
caactgctgc taacggggac caaggcaata ctacctggca aaaagattca acatttgcag 24180
aacgcaatga aataggggtg ggaaataact ttgccatgga aattaacctg aatgccaacc 24240
tatggagaaa tttcctttac tccaatattg cgctgtacct gccagacaag ctaaaataca 24300
accccaccaa tgtggaaata tctgacaacc ccaacaccta cgactacatg aacaagcgag 24360
tggtggctcc tgggcttgta gactgctaca ttaaccttgg ggcgcgctgg tctctggact 24420
acatggacaa cgttaatccc tttaaccacc accgcaatgc gggcctgcgt taccgctcca 24480
tgttgttggg aaacggccgc tacgtgccct ttcacattca ggtgccccaa aagttttttg 24540
ccattaaaaa cctcctcctc ctgccaggct catacacata tgaatggaac ttcaggaagg 24600
atgttaacat ggttctgcag agctctctgg gaaacgacct tagagttgac ggggctagca 24660
ttaagtttga cagcatttgt ctttacgcca ccttcttccc catggcccac aacacggcct 24720
ccacgctgga agccatgctc agaaatgaca ccaacgacca gtcctttaat gactaccttt 24780
ccgccgccaa catgctatat cccatacccg ccaacgccac caacgtgccc atctccatcc 24840
catcgcgcaa ctgggcagca tttcgcggtt gggccttcac acgcttgaag acaaaggaaa 24900
ccccttccct gggatcaggc tacgaccctt actacaccta ctctggctcc ataccatacc 24960
ttgacggaac cttctatctt aatcacacct ttaagaaggt ggccattact tttgactctt 25020
ctgttagctg gccgggcaac gaccgcctgc ttactcccaa tgagtttgag attaagcgct 25080
cagttgacgg ggagggctat aacgtagctc agtgcaacat gacaaaggac tggttcctag 25140
tgcagatgtt ggccaactac aatattggct accagggctt ctacattcca gaaagctaca 25200
aagaccgcat gtactcgttc ttcagaaact tccagcccat gagccggcaa gtggtggacg 25260
atactaaata caaagattat cagcaggttg gaattatcca ccagcataac aactcaggct 25320

tcgtaggcta cctcgctccc accatgcgcg agggacaagc ttaccccgct aatgttccct 25380 

acccactaat aggcaaaacc gcggttgata gtattaccca gaaaaagttt ctttgcgacc 25440 

gcaccctgtg gcgcatcccc ttctccagta actttatgtc catgggtgcg ctcacagacc 25500 

tgggccaaaa ccttctctac gcaaactccg cccacgcgct agacatgacc tttgaggtgg 25560 

atcccatgga cgagcccacc cttctttatg ttttgtttga agtctttgac gtggtccgtg 25620 

tgcaccagcc gcaccgcggc gtcatcgaga ccgtgtacct gcgcacgccc ttctcggccg 25680 

gcaacgccac aacataaaga agcaagcaac atcaacaaca gctgccgcca tgggctccag 25740 

tgagcaggaa ctgaaagcca ttgtcaaaga tcttggttgt gggccatatt ttttgggcac 25800 

ctatgacaag cgcttcccag gctttgtttc cccacacaag ctcgcctgcg ccatagttaa 25860 

cacggccggt cgcgagactg ggggcgtaca ctggatggcc tttgcctgga acccgcgctc 25920 

aaaaacatgc tacctctttg agccctttgg cttttctgac caacgtctca agcaggttta 25980 

ccagtttgag tacgagtcac tcctgcgccg tagcgccatt gcctcttccc ccgaccgctg 26040 

tataacgctg gaaaagtcca cccaaagcgt gcaggggccc aactcggccg cctgtggcct 26100 

attctgctgc atgtttctcc acgcctttgc caactggccc caaactccca tggatcacaa 26160 

ccccaccatg aaccttatta ccggggtacc caactccatg cttaacagtc cccaggtaca 26220 

gcccaccctg cgccgcaacc aggaacagct ctacagcttc ctggagcgcc actcgcccta 26280 

cttccgcagc cacagtgcgc aaattaggag cgccacttct ttttgtcact tgaaaaacat 26340 

gtaaaaataa tgtactagga gacactttca ataaaggcaa atgtttttat ttgtacactc 26400 

tcgggtgatt atttaccccc acccttgccg tctgcgccgt ttaaaaatca aaggggttct 26460 

gccgcgcatc gctatgcgcc actggcaggg acacgttgcg atactggtgt ttagtgctcc 26520 

acttaaactc aggcacaacc atccgcggca gctcggtgaa gttttcactc cacaggctgc 26580 

gcaccatcac caacgcgttt agcaggtcgg gcgccgatat cttgaagtcg cagttggggc 26640 

ctccgccctg cgcgcgcgag ttgcgataca cagggttaca gcactggaac actatcagcg 26700 

ccgggtggtg cacgctggcc agcacgctct tgtcggagat cagatccgcg tccaggtcct 26760 

ccgcgttgct cagggcgaac ggagtcaact ttggtagctg ccttcccaaa aagggtgcat 26820

gcccaggctt tgagttgcac tcgcaccgta gtggcatcag aaggtgaccg tgcccagtct 26880 

gggcgttagg atacagcgcc tgcatgaaag ccttgatctg cttaaaagcc acctgagcct 26940 

ttgcgccttc agagaagaac atgccgcaag acttgccgga aaactgattg gccggacagg 27000 

ccgcgtcatg cacgcagcac cttgcgtcgg tgttggagat ctgcaccaca tttcggcccc 27060 

accggttctt cacgatcttg gccttgctag actgctcctt cagcgcgcgc tgcccgtttt 27120 

cgctcgtcac atccatttca atcacgtgct ccttatttat cataatgctc ccgtgtagac 27180 

acttaagctc gccttcgatc tcagcgcagc ggtgcagcca caacgcgcag cccgtgggct 27240 

cgtggtgctt gtaggttacc tctgcaaacg actgcaggta cgcctgcagg aatcgcccca 27300 

tcatcgtcac aaaggtcttg ttgctggtga aggtcagctg caacccgcgg tgctcctcgt 27360 

ttagccaggt cttgcatacg gccgccagag cttccacttg gtcaggcagt agcttgaagt 27420 

ttgcctttag atcgttatcc acgtggtact tgtccatcaa cgcgcgcgca gcctccatgc 27480 

ccttctccca cgcagacacg atcggcaggc tcagcgggtt tatcaccgtg ctttcacttt 27540 

ccgcttcact ggactcttcc ttttcctctt gcatccgcat accccgcgcc actgggtcgt 27600 

cttcattcag ccgccgcacc gtgcgcttac ctcccttgcc gtgcttgatt agcaccggtg 27660 

ggttgctgaa acccaccatt tgtagcgcca catcttctct ttcttcctcg ctgtccacga 27720 

tcacctctgg ggatggcggg cgctcgggct tgggagaggg gcgcttcttt ttctttttgg 27780 

acgcaatggc caaatccgcc gtcgaggtcg atggccgcgg gctgggtgtg cgcggcacca 27840 

gcgcatcttg tgacgagtct tcttcgtcct cggactcgag acgccgcctc agccgctttt 27900 

ttgggggcgc gcggggaggc ggcggcgacg gcgacgggga cgagacgtcc tccatggttg 27960 

gtggacgtcg cgccgcaccg cgtccgcgct cgggggtggt ttcgcgctgc tcctcttccc 28020 

gactggccat ttccttctcc tataggcaga aaaagatcat ggagtcagtc gagaaggagg 28080 

acagcctaac cgcccccttt gagttcgcca ccaccgcctc caccgatgcc gccaacgcgc 28140 

ctaccacctt ccccgtcgag gcacccccgc ttgaggagga ggaagtgatt atcgagcagg 28200 

acccaggttt tgtaagcgaa gacgacgaag atcgctcagt accaacagag gataaaaagc 28260 

aagaccagga cgacgcagag gcaaacgagg aacaagtcgg gcggggggac caaaggcatg 28320 

gcgactacct agatgtggga gacgacgtgc tgttgaagca tctgcagcgc cagtgcgcca 28380 

ttatctgcga cgcgttgcaa gagcgcagcg atgtgcccct cgccatagcg gatgtcagcc 28440 

ttgcctacga acgccacctg ttctcaccgc gcgtaccccc caaacgccaa gaaaacggca 28500 

catgcgagcc caacccgcgc ctcaacttct accccgtatt tgccgtgcca gaggtgcttg 28560 
ccacctatca catctttttc caaaactgca agatacccct atcctgccgt gccaaccgca 28620
gccgagcgga caagcagctg gccttgcggc agggcgctgt catacctgat atcgcctcgc 28680
tcgacgaagt gccaaaaatc tttgagggtc ttggacgcga cgagaagcgc gcggcaaacg 28740
ctctgcaaca agaaaacagc gaaaatgaaa gtcactgtgg agtgctggtg gaacttgagg 28800
gtgacaacgc gcgcctagcc gtgctgaaac gcagcatcga ggtcacccac tttgcctacc 28860
cggcacttaa cctacccccc aaggttatga gcacagtcat gagcgagctg atcgtgcgcc 28920
gtgcacgacc cctggagagg gatgcaaact tgcaagaaca aaccgaggag ggcctacccg 28980
cagttggcga tgagcagctg gcgcgctggc ttgagacgcg cgagcctgcc gacttggagg 29040
agcgacgcaa gctaatgatg gccgcagtgc ttgttaccgt ggagcttgag tgcatgcagc 29100
ggttctttgc tgacccggag atgcagcgca agctagagga aacgttgcac tacacctttc 29160
gccagggcta cgtgcgccag gcctgcaaaa tttccaacgt ggagctctgc aacctggtct 29220
cctaccttgg aattttgcac gaaaaccgcc ttgggcaaaa cgtgcttcat tccacgctca 29280
agggcgaggc gcgccgcgac tacgtccgcg actgcgttta cttatttctg tgctacacct 29340
ggcaaacggc catgggcgtg tggcagcagt gcctggagga gcgcaacctg aaggagctgc 29400
agaagctgct aaagcaaaac ttgaaggacc tatggacggc cttcaacgag cgctccgtgg 29460
ccgcgcacct ggcggacatt atcttccccg aacgcctgct taaaaccctg caacagggtc 29520
tgccagactt caccagtcaa agcatgttgc aaaactttag gaactttatc ctagagcgtt 29580
caggaattct gcccgccacc tgctgtgcgc ttcctagcga ctttgtgccc attaagtacc 29640
gtgaatgccc tccgccgctt tggggtcact gctaccttct gcagctagcc aactaccttg 29700
cctaccactc cgacatcatg gaagacgtga gcggtgacgg cctactggag tgtcactgtc 29760
gctgcaacct atgcaccccg caccgctccc tggtctgcaa ttcacaactg cttagcgaaa 29820
gtcaaattat cggtaccttt gagctgcagg gtccctcgcc tgacgaaaag tccgcggctc 29880
cggggttgaa actcactccg gggctgtgga cgtcggctta ccttcgcaaa tttgtacctg 29940
aggactacca cgcccacgag attaggttct acgaagacca atcccgcccg ccaaatgcgg 30000
agcttaccgc ctgcgtcatt acccagggcc acatccttgg ccaattgcaa gccattaaca 30060
aagcccgcca agagtttctg ctacgaaagg gacggggggt ttacttggac ccccagtccg 30120
gcgaggagct caacccaatc cccccgccgc cgcagcccta tcagcagccg cgggcccttg 30180
cttcccagga tggcacccaa aaagaagctg cagctgccgc cgccgccacc cacggacgag 30240
gaggaatact gggacagtca ggcagaggag gttttggacg aggaggagga gatgatggaa 30300
gactgggaca gcctagacga ggaagcttcc gaggccgaag aggtgtcaga cgaaacaccg 30360
tcaccctcgg tcgcattccc ctcgccggcg ccccagaaat cggcaaccgt tcccagcatt 30420
gctacaacct ccgctcctca ggcgccgccg gcactgcccg ttcgccgacc caaccgtaga 30480
tgggacacca ctggaaccag ggccggtaag tctaagcagc cgccgccgtt agcccaagag 30540
caacaacagc gccaaggcta ccgctcgtgg cgcgtgcaca agaacgccat agttgcttgc 30600
ttgcaagact gtgggggcaa catctccttc gcccgccgct ttcttctcta ccatcacggc 30660
gtggccttcc cccgtaacat cctgcattac taccgtcatc tctacagccc ctactgcacc 30720
ggcggcagcg gcagcaacag cagcggccac gcagaagcaa aggcgaccgg atagcaagac 30780
tctgacaaag cccaagaaat ccacagcggc ggcagcagca ggaggaggag cactgcgtct 30840
ggcgcccaac gaacccgtat cgacccgcga gcttagaaac aggatttttc ccactctgta 30900
tgctatattt caacagagca ggggccaaga acaagagctg aaaataaaaa acaggtctct 30960
gcgctccctc acccgcagct gcctgtatca caaaagcgaa gatcagcttc ggcgcacgct 31020
ggaagacgcg gaggctctct tcagcaaata ctgcgcgctg actcttaagg actagtttcg 31080
cgccctttct caaatttaag cgcgaaaact acgtcatctc cagcggccac acccggcgcc 31140
agcacctgtc gtcagcgcca ttatgagcaa ggaaattccc acgccctaca tgtggagtta 31200
ccagccacaa atgggacttg cggctggagc tgcccaagac tactcaaccc gaataaacta 31260
catgagcgcg ggaccccaca tgatatcccg ggtcaacgga atccgcgccc accgaaaccg 31320
aattctcctc gaacaggcgg ctattaccac cacacctcgt aataacctta atccccgtag 31380
ttggcccgct gccctggtgt accaggaaag tcccgctccc accactgtgg tacttcccag 31440
agacgcccag gccgaagttc agatgactaa ctcaggggcg cagcttgcgg gcggctttcg 31500
tcacagggtg cggtcgcccg ggcagggtat aactcacctg aaaatcagag ggcgaggtat 31560
tcagctcaac gacgagtcgg tgagctcctc tcttggtctc cgtccggacg ggacatttca 31620
gatcggcggc gctggccgct cttcatttac gccccgtcag gcgatcctaa ctctgcagac 31680
ctcgtcctcg gagccgcgct ccggaggcat tggaactcta caatttattg aggagttcgt 31740
gccttcggtt tacttcaacc ccttttctgg acctcccggc cactacccgg accagtttat 31800
tcccaacttt gacgcggtaa aagactcggc ggacggctac gactgaatga ccagtggaga 31860
ggcagagcaa ctgcgcctga cacacctcga ccactgccgc cgccacaagt gctttgcccg 31920
cggctccggt gagttttgtt actttgaatt gcccgaagag catatcgagg gcccggcgca 31980
cggcgtccgg ctcaccaccc aggtagagct tacacgtagc ctgattcggg agtttaccaa 32040
gcgccccctg ctagtggagc gggagcgggg tccctgtgtt ctgaccgtgg tttgcaactg 32100
tcctaaccct ggattacatc aagatcttat tccattcaac taacaataaa cacacaataa 32160
attacttact taaaatcagt cagcaaatct ttgtccagct tattcagcat cacctccttt 32220
ccctcctccc aactctggta tttcagcagc cttttagctg cgaactttct ccaaagtcta 32280
aatgggatgt caaattcctc atgttcttgt ccctccgcac ccactatctt catattgttg 32340
cagatgaaac gcgccagacc gtctgaagac accttcaacc ctgtgtaccc atatgacacg 32400
gaaaccggcc ctccaactgt gcctttcctt acccctccct ttgtgtcgcc aaatgggttc 32460
caagaaagtc cccccggagt gctttctttg cgtctttcag aacctttggt tacctcacac 32520
ggcatgcttg cgctaaaaat gggcagcggc ctgtccctgg atcaggcagg caaccttaca 32580
tcaaatacaa tcactgtttc tcaaccgcta aaaaaaacaa agtccaatat aactttggaa 32640
acatccgcgc cccttacagt cagctcaggc gccctaacca tggccacaac ttcgcctttg 32700
gtggtctctg acaacactct taccatgcaa tcacaagcac cgctaaccgt gcaagactca 32760
aaacttagca ttgctaccaa agagccactt acagtgttag atggaaaact ggccctgcag 32820
acatcagccc ccctctctgc cactgataac aacgccctca ctatcactgc ctcacctcct 32880
cttactactg caaatggtag tctggctgtt accatggaaa acccacttta caacaacaat 32940
ggaaaacttg ggctcaaaat tggcggtcct ttgcaagtgg ccaccgactc acatgcacta 33000
acactaggta ctggtcaggg ggttgcagtt cataacaatt tgctacatac aaaagttaca 33060
ggcgcaatag ggtttgatac atctggcaac atggaactta aaactggaga tggcctctat 33120
gtggatagcg ccggtcctaa ccaaaaacta catattaatc taaataccac aaaaggcctt 33180
gcttttgaca acaccgcaat aacaattaac gctggaaaag ggttggaatt tgaaacagac 33240
tcctcaaacg gaaatcccat aaaaacaaaa attggatcag gcatacaata taataccaat 33300
ggagctatgg ttgcaaaact tggaacaggc ctcagttttg acagctccgg agccataaca 33360
atgggcagca taaacaatga cagacttact ctttggacaa caccagaccc atccccaaat 33420
tgcagaattg cttcagataa agactgcaag ctaactctgg cgctaacaaa atgtggcagt 33480
caaattttgg gcactgtttc agctttggca gtatcaggta atatggcctc catcaatgga 33540
actctaagca gtgtaaactt ggttcttaga tttgatgaca acggagtgct tatgtcaaat 33600
tcatcactgg acaaacagta ttggaacttt agaaacgggg actccactaa cggtcaacca 33660
tacacttatg ctgttgggtt tatgccaaac ctaaaagctt acccaaaaac tcaaagtaaa 33720
actgcaaaaa gtaatattgt tagccaggtg tatcttaatg gtgacaagtc taaaccattg 33780
cattttacta ttacgctaaa tggaacagat gaaaccaacc aagtaagcaa atactcaata 33840
tcattcagtt ggtcctggaa cagtggacaa tacactaatg acaaatttgc caccaattcc 33900
tataccttct cctacattgc ccaggaataa agaatcgtga acctgttgca tgttatgttt 33960
caacgtgttt atttttcaat tgcagaaaat ttcaagtcat ttttcattca gtagtatagc 34020
cccaccacca catagcttat actaatcacc gtaccttaat caaactcaca gaaccctagt 34080
attcaacctg ccacctccct cccaacacac agagtacaca gtcctttctc cccggctggc 34140
cttaaacagc atcatatcat gggtaacaga catattctta ggtgttatat tccacacggt 34200
ctcctgtcga gccaaacgct catcagtgat gttaataaac tccccgggca gctcgcttaa 34260
gttcatgtcg ctgtccagct gctgagccac aggctgctgt ccaacttgcg gttgctcaac 34320
gggcggcgaa ggagaagtcc acgcctacat gggggtagag tcataatcgt gcatcaggat 34380
agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc cgccgccgct ccgtcctgca 34440
ggaatacaac atggcagtgg tctcctcagc gatgattcgc accgcccgca gcataaggcg 34500
ccttgtcctc cgggcacagc agcgcaccct gatctcactt aagtcagcac agtaactgca 34560
gcacagtacc acaatattgt ttaaaatccc acagtgcaag gcgctgtatc caaagctcat 34620
ggcggggacc acagaaccca cgtggccatc ataccacaag cgcaggtaga ttaagtggcg 34680
acccctcata aacacgctgg acataaacat tacctctttt ggcatgttgt aattcaccac 34740
ctcccggtac catataaacc tctgattaaa catggcgcca tccaccacca tcctaaacca 34800
gctggccaaa acctgcccgc cggctatgca ctgcagggaa ccgggactgg aacaatgaca 34860
gtggagagcc caggactcgt aaccatggat catcatgctc gtcatgatat caatgttggc 34920
acaacacagg cacacgtgca tacacttcct caggattaca agctcctccc gcgtcagaac 34980
catatcccag ggaacaaccc attcctgaat cagcgtaaat cccacactgc agggaagacc 35040
tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat tcgggcagca gcggatgatc 35100
ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt agacgatccc tactgtacgg 35160
agtgcgccga gacaaccgag atcgtgttgg tcgtagtgtc atgccaaatg gaacgccgga 35220
cgtagtcata tttcctgaag caaaaccagg tgcgggcgtg acaaacagat ctgcgtctcc 35280
ggtctcgccg cttagatcgc tctgtgtagt agttgtagta tatccactct ctcaaagcat 35340
ccaggcgccc cctggcttcg ggttctatgt aaactccttc atgcgccgct gccctgataa 35400
catccaccac cgcagaataa gccacaccca gccaacctac acattcgttc tgcgagtcac 35460
acacgggagg agcgggaaga gctggaagaa ccatgttttt ttttttattc caaaagatta 35520
tccaaaacct caaaatgaag atctattaag tgaacgcgct cccctccggt ggcgtggtca 35580
aactctacag ccaaagaaca gataatggca tttgtaagat gttgcacaat ggcttccaaa 35640
aggcaaacgg ccctcacgtc caagtggacg taaaggctaa acccttcagg gtgaatctcc 35700
tctataaaca ttccagcacc ttcaaccatg cccaaataat tctcatctcg ccaccttctc 35760
aatatatctc taagcaaatc ccgaatatta agtccggcca ttgtaaaaat ctgctccaga 35820
gcgccctcca ccttcagcct caagcagcga atcatgattg caaaaattca ggttcctcac 35880
agacctgtat aagattcaaa agcggaacat taacaaaaat accgcgatcc cgtaggtccc 35940
ttcgcagggc cagctgaaca taatcgtgca ggtctgcacg gaccagcgcg gccacttccc 36000
cgccaggaac catgacaaaa gaacccacac tgattatgac acgcatactc ggagctatgc 36060
taaccagcgt agccccgatg taagcttgtt gcatgggcgg cgatataaaa tgcaaggtgc 36120
tgctcaaaaa atcaggcaaa gcctcgcgca aaaaagaaag cacatcgtag tcatgctcat 36180
gcagataaag gcaggtaagc tccggaacca ccacagaaaa agacaccatt tttctctcaa 36240
acatgtctgc gggtttctgc ataaacacaa aataaaataa caaaaaaaca tttaaacatt 36300
agaagcctgt cttacaacag gaaaaacaac ccttataagc ataagacgga ctacggccat 36360
gccggcgtga ccgtaaaaaa actggtcacc gtgattaaaa agcaccaccg acagctcctc 36420
ggtcatgtcc ggagtcataa tgtaagactc ggtaaacaca tcaggttgat tcacatcggt 36480
cagtgctaaa aagcgaccga aatagcccgg gggaatacat acccgcaggc gtagagacaa 36540
cattacagcc cccataggag gtataacaaa attaatagga gagaaaaaca cataaacacc 36600
tgaaaaaccc tcctgcctag gcaaaatagc accctcccgc tccagaacaa catacagcgc 36660
ttccacagcg gcagccataa cagtcagcct taccagtaaa aaagaaaacc tattaaaaaa 36720
acaccactcg acacggcacc agctcaatca gtcacagtgt aaaaaagggc caagtgcaga 36780
gcgagtatat ataggactaa aaaatgacgt aacggttaaa gtccacaaaa aacacccaga 36840
aaaccgcacg cgaacctacg cccagaaacg aaagccaaaa aacccacaac ttcctcaaat 36900
cgtcacttcc gttttcccac gttacgtcac ttcccatttt aagaaaacta caattcccaa 36960
cacatacaag ttactccgcc ctaaaaccta cgtcacccgc cccgttccca cgccccgcgc 37020
cacgtcacaa actccacccc ctcattatca tattggcttc aatccaaaat aaggtatatt 37080
attgatgatg 37090



<210> 5
<211> 5955
<212> DNA
<213> Artificial Sequence

<220>
<223> NS cDNA sequence

<221> CDS
<222> (1)..(5955)

<400> 5

atg gcg ccc atc acg gcc tac tcc caa cag acg cgg ggc cta ctt ggt      48
Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15

tgc atc atc act agc ctt aca ggc cgg gac aag aac cag gtc gag gga      96
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
            20                  25                  30

gag gtt cag gtg gtt tcc acc gca aca caa tcc ttc ctg gcg acc tgc      144
Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
        35                  40                  45

gtc aac ggc gtg tgt tgg acc gtt tac cat ggt gct ggc tca aag acc      192
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
    50                  55                  60

tta gcc ggc cca aag ggg cca atc acc cag atg tac act aat gtg gac      240
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80

cag gac ctc gtc ggc tgg cag gcg ccc ccc ggg gcg cgt tcc ttg aca      288
Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
                85                  90                  95

cca tgc acc tgt ggc agc tca gac ctt tac ttg gtc acg aga cat gct      336
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110

gac gtc att ccg gtg cgc cgg cgg ggc gac agt agg ggg agc ctg ctc      384
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125

tcc ccc agg cct gtc tcc tac ttg aag ggc tct tcg ggt ggt cca ctg      432
Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140

ctc tgc cct tcg ggg cac gct gtg ggc atc ttc cgg gct gcc gta tgc      480
Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145                 150                 155                 160

acc cgg ggg gtt gcg aag gcg gtg gac ttt gtg ccc gta gag tcc atg      528
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
                165                 170                 175

gaa act act atg cgg tct ccg gtc ttc acg gac aac tca tcc ccc ccg      576
Glu Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro
            180                 185                 190

gcc gta ccg cag tca ttt caa gtg gcc cac cta cac gct ccc act ggc      624
Ala Val Pro Gln Ser Phe Gln Val Ala His Leu His Ala Pro Thr Gly
        195                 200                 205

agc ggc aag agt act aaa gtg ccg gct gca tat gca gcc caa ggg tac      672
Ser Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gln Gly Tyr
    210                 215                 220

aag gtg ctc gtc ctc aat ccg tcc gtt gcc gct acc tta ggg ttt ggg      720
Lys Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly
225                 230                 235                 240

gcg tat atg tct aag gca cac ggt att gac ccc aac atc aga act ggg      768
Ala Tyr Met Ser Lys Ala His Gly Ile Asp Pro Asn Ile Arg Thr Gly
                245                 250                 255
245 250 255 

gta agg acc att acc aca ggc gcc ccc gtc aca tac tct acc tat ggc 816
Val Arg Thr Ile Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly
260 265 270

aag ttt ctt gcc gat ggt ggt tgc tct ggg ggc gct tat gac atc ata 864
Lys Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp Ile Ile
275 280 285

ata tgt gat gag tgc cat tca act gac tcg act aca atc ttg ggc atc 912
Ile Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr Ile Leu Gly Ile
290 295 300

ggc aca gtc ctg gac caa gcg gag acg gct gga gcg cgg ctt gtc gtg 960
Gly Thr Val Leu Asp Gln Ala Glu Thr Ala Gly Ala Arg Leu Val Val
305 310 315 320

ctc gcc acc gct acg cct ccg gga tcg gtc acc gtg cca cac cca aac 1008
Leu Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn
325 330 335

atc gag gag gtg gcc ctg tct aat act gga gag atc ccc ttc tat ggc 1056
Ile Glu Glu Val Ala Leu Ser Asn Thr Gly Glu Ile Pro Phe Tyr Gly
340 345 350

aaa gcc atc ccc att gaa gcc atc agg ggg gga agg cat ctc att ttc 1104
Lys Ala Ile Pro Ile Glu Ala Ile Arg Gly Gly Arg His Leu Ile Phe
355 360 365

tgt cat tcc aag aag aag tgc gac gag ctc gcc gca aag ctg tca ggc 1152
Cys His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly
370 375 380

ctc gga atc aac gct gtg gcg tat tac cgg ggg ctc gat gtg tcc gtc 1200
Leu Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val
385 390 395 400

ata cca act atc gga gac gtc gtt gtc gtg gca aca gac gct ctg atg 1248
Ile Pro Thr Ile Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met
405 410 415

acg ggc tat acg ggc gac ttt gac tca gtg atc gac tgt aac aca tgt 1296
Thr Gly Tyr Thr Gly Asp Phe Asp Ser Val Ile Asp Cys Asn Thr Cys
420 425 430

gtc acc cag aca gtc gac ttc agc ttg gat ccc acc ttc acc att gag 1344
Val Thr Gln Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr Ile Glu
435 440 445

acg acg acc gtg cct caa gac gca gtg tcg cgc tcg cag cgg cgg ggt 1392
Thr Thr Thr Val Pro Gln Asp Ala Val Ser Arg Ser Gln Arg Arg Gly
450 455 460

agg act ggc agg ggt agg aga ggc atc tac agg ttt gtg act ccg gga 1440
Arg Thr Gly Arg Gly Arg Arg Gly Ile Tyr Arg Phe Val Thr Pro Gly
465 470 475 480

gaa cgg ccc tcg ggc atg ttc gat tcc tcg gtc ctg tgt gag tgc tat 1488
Glu Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr
485 490 495

gac gcg ggc tgt gct tgg tac gag ctc acc ccc gcc gag acc tcg gtt 1536
Asp Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val
500 505 510

agg ttg cgg gcc tac ctg aac aca cca ggg ttg ccc gtt tgc cag gac 1584
Arg Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gln Asp
515 520 525

cac ctg gag ttc tgg gag agt gtc ttc aca ggc ctc acc cac ata gat 1632
His Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His Ile Asp
530 535 540

gca cac ttc ttg tcc cag acc aag cag gca gga gac aac ttc ccc tac 1680
Ala His Phe Leu Ser Gln Thr Lys Gln Ala Gly Asp Asn Phe Pro Tyr
545 550 555 560

ctg gta gca tac caa gcc acg gtg tgc gcc agg gct cag gcc cca cct 1728
Leu Val Ala Tyr Gln Ala Thr Val Cys Ala Arg Ala Gln Ala Pro Pro
565 570 575

cca tca tgg gat caa atg tgg aag tgt ctc ata cgg ctg aaa cct acg 1776
Pro Ser Trp Asp Gln Met Trp Lys Cys Leu Ile Arg Leu Lys Pro Thr
580 585 590

ctg cac ggg cca aca ccc ttg ctg tac agg ctg gga gcc gtc caa aat 1824
Leu His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gln Asn
595 600 605

gag gtc acc ctc acc cac ccc ata acc aaa tac atc atg gca tgc atg 1872
Glu Val Thr Leu Thr His Pro Ile Thr Lys Tyr Ile Met Ala Cys Met
610 615 620

tcg gct gac ctg gag gtc gtc act agc acc tgg gtg ctg gtg ggc gga 1920
Ser Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly
625 630 635 640

gtc ctt gca gct ctg gcc gcg tat tgc ctg aca aca ggc agt gtg gtc 1968
Val Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val
645 650 655

att gtg ggt agg att atc ttg tcc ggg agg ccg gct att gtt ccc gac 2016
Ile Val Gly Arg Ile Ile Leu Ser Gly Arg Pro Ala Ile Val Pro Asp
660 665 670

agg gag ttt ctc tac cag gag ttc gat gaa atg gaa gag tgc gcc tcg 2064
Arg Glu Phe Leu Tyr Gln Glu Phe Asp Glu Met Glu Glu Cys Ala Ser
675 680 685

cac ctc cct tac atc gag cag gga atg cag ctc gcc gag caa ttc aag 2112
His Leu Pro Tyr Ile Glu Gln Gly Met Gln Leu Ala Glu Gln Phe Lys
690 695 700

cag aaa gcg ctc ggg tta ctg caa aca gcc acc aaa caa gcg gag gct 2160
Gln Lys Ala Leu Gly Leu Leu Gln Thr Ala Thr Lys Gln Ala Glu Ala
705 710 715 720

gct gct ccc gtg gtg gag tcc aag tgg cga gcc ctt gag aca ttc tgg 2208
Ala Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp
725 730 735

gcg aag cac atg tgg aat ttc atc agc ggg ata cag tac tta gca ggc 2256
Ala Lys His Met Trp Asn Phe Ile Ser Gly Ile Gln Tyr Leu Ala Gly
740 745 750

tta tcc act ctg cct ggg aac ccc gca ata gca tca ttg atg gca ttc 2304
Leu Ser Thr Leu Pro Gly Asn Pro Ala Ile Ala Ser Leu Met Ala Phe
755 760 765

aca gcc tct atc acc agc ccg ctc acc acc caa agt acc ctc ctg ttt 2352
Thr Ala Ser Ile Thr Ser Pro Leu Thr Thr Gln Ser Thr Leu Leu Phe
770 775 780

aac atc ttg ggg ggg tgg gtg gct gcc caa ctc gcc ccc ccc agc gcc 2400
Asn Ile Leu Gly Gly Trp Val Ala Ala Gln Leu Ala Pro Pro Ser Ala
785 790 795 800

gct tcg gct ttc gtg ggc gcc ggc atc gcc ggt gcg gct gtt ggc agc 2448
Ala Ser Ala Phe Val Gly Ala Gly Ile Ala Gly Ala Ala Val Gly Ser
805 810 815

ata ggc ctt ggg aag gtg ctt gtg gac att ctg gcg ggt tat gga gca 2496
Ile Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala
820 825 830

gga gtg gcc ggc gcg ctc gtg gcc ttc aag gtc atg agc ggc gag atg 2544
Gly Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met
835 840 845

ccc tcc acc gag gac ctg gtc aat cta ctt cct gcc atc ctc tct cct 2592
Pro Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro
850 855 860

ggc gcc ctg gtc gtc ggg gtc gtg tgt gca gca ata ctg cgt cga cac 2640
Gly Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His
865 870 875 880

gtg ggt ccg gga gag ggg gct gtg cag tgg atg aac cgg ctg ata gcg 2688
Val Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala
885 890 895

ttc gcc tcg cgg ggt aat cat gtt tcc ccc acg cac tat gtg cct gag 2736
Phe Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu
900 905 910

agc gac gcc gca gcg cgt gtt act cag atc ctc tcc agc ctt acc atc 2784
Ser Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile
915 920 925

act cag ctg ctg aaa agg ctc cac cag tgg att aat gaa gac tgc tcc 2832
Thr Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser
930 935 940

aca ccg tgt tcc ggc tcg tgg cta agg gat gtt tgg gac tgg ata tgc 2880
Thr Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys
945 950 955 960

acg gtg ttg act gac ttc aag acc tgg ctc cag tcc aag ctc ctg ccg 2928
Thr Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro
965 970 975

cag cta ccg gga gtc cct ttt ttc tcg tgc caa cgc ggg tac aag gga 2976
Gln Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly
980 985 990

gtc tgg cgg gga gac ggc atc atg caa acc acc tgc cca tgt gga gca 3024
Val Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala
995 1000 1005

cag atc acc gga cat gtc aaa aac ggt tcc atg agg atc gtc ggg cct 3072
Gln Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro
1010 1015 1020

aag acc tgc agc aac acg tgg cat gga aca ttc ccc atc aac gca tac 3120
Lys Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr
1025 1030 1035 1040

acc acg ggc ccc tgc aca ccc tct cca gcg cca aac tat tct agg gcg 3168
Thr Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala
1045 1050 1055

ctg tgg cgg gtg gcc gct gag gag tac gtg gag gtc acg cgg gtg ggg 3216
Leu Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly
1060 1065 1070

gat ttc cac tac gtg acg ggc atg acc act gac aac gta aag tgc cca 3264
Asp Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro
1075 1080 1085

tgc cag gtt ccg gct cct gaa ttc ttc acg gag gtg gac gga gtg cgg 3312
Cys Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg
1090 1095 1100

ttg cac agg tac gct ccg gcg tgc agg cct ctc cta cgg gag gag gtt 3360
Leu His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val
1105 1110 1115 1120

aca ttc cag gtc ggg ctc aac caa tac ctg gtt ggg tca cag cta cca 3408
Thr Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro
1125 1130 1135

tgc gag ccc gaa ccg gat gta gca gtg ctc act tcc atg ctc acc gac 3456
Cys Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp
1140 1145 1150

ccc tcc cac atc aca gca gaa acg gct aag cgt agg ttg gcc agg ggg 3504
Pro Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly
1155 1160 1165

tct ccc ccc tcc ttg gcc agc tct tca gct agc cag ttg tct gcg cct 3552
Ser Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro
1170 1175 1180

tcc ttg aag gcg aca tgc act acc cac cat gtc tct ccg gac gct gac 3600
Ser Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp
1185 1190 1195 1200

ctc atc gag gcc aac ctc ctg tgg cgg cag gag atg ggc ggg aac atc 3648
Leu Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile
1205 1210 1215

acc cgc gtg gag tcg gag aac aag gtg gta gtc ctg gac tct ttc gac 3696
Thr Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp
1220 1225 1230

ccg ctt cga gcg gag gag gat gag agg gaa gta tcc gtt ccg gcg gag 3744
Pro Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu
1235 1240 1245

atc ctg cgg aaa tcc aag aag ttc ccc gca gcg atg ccc atc tgg gcg 3792
Ile Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala
1250 1255 1260

cgc ccg gat tac aac cct cca ctg tta gag tcc tgg aag gac ccg gac 3840
Arg Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp
1265 1270 1275 1280

tac gtc cct ccg gtg gtg cac ggg tgc ccg ttg cca cct atc aag gcc 3888
Tyr Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro Ile Lys Ala
1285 1290 1295

cct cca ata cca cct cca cgg aga aag agg acg gtt gtc cta aca gag 3936
Pro Pro Ile Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu
1300 1305 1310

tcc tcc gtg tct tct gcc tta gcg gag ctc gct act aag acc ttc ggc 3984
Ser Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly
1315 1320 1325

agc tcc gaa tca tcg gcc gtc gac agc ggc acg gcg acc gcc ctt cct 4032
Ser Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro
1330 1335 1340

gac cag gcc tcc gac gac ggt gac aaa gga tcc gac gtt gag tcg tac 4080
Asp Gln Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr
1345 1350 1355 1360

tcc tcc atg ccc ccc ctt gag ggg gaa ccg ggg gac ccc gat ctc agt 4128
Ser Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser
1365 1370 1375

gac ggg tct tgg tct acc gtg agc gag gaa gct agt gag gat gtc gtc 4176
Asp Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val
1380 1385 1390

tgc tgc tca atg tcc tac aca tgg aca ggc gcc ttg atc acg cca tgc 4224
Cys Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu Ile Thr Pro Cys
1395 1400 1405

gct gcg gag gaa agc aag ctg ccc atc aac gcg ttg agc aac tct ttg 4272
Ala Ala Glu Glu Ser Lys Leu Pro Ile Asn Ala Leu Ser Asn Ser Leu
1410 1415 1420

ctg cgc cac cat aac atg gtt tat gcc aca aca tct cgc agc gca ggc 4320
Leu Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly
1425 1430 1435 1440

ctg cgg cag aag aag gtc acc ttt gac aga ctg caa gtc ctg gac gac 4368
Leu Arg Gln Lys Lys Val Thr Phe Asp Arg Leu Gln Val Leu Asp Asp
1445 1450 1455

cac tac cgg gac gtg ctc aag gag atg aag gcg aag gcg tcc aca gtt 4416
His Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val
1460 1465 1470

aag gct aaa ctc cta tcc gta gag gaa gcc tgc aag ctg acg ccc cca 4464
Lys Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro
1475 1480 1485

cat tcg gcc aaa tcc aag ttt ggc tat ggg gca aag gac gtc cgg aac 4512
His Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn
1490 1495 1500

cta tcc agc aag gcc gtt aac cac atc cac tcc gtg tgg aag gac ttg 4560
Leu Ser Ser Lys Ala Val Asn His Ile His Ser Val Trp Lys Asp Leu
1505 1510 1515 1520

ctg gaa gac act gtg aca cca att gac acc acc atc atg gca aaa aat 4608
Leu Glu Asp Thr Val Thr Pro Ile Asp Thr Thr Ile Met Ala Lys Asn
1525 1530 1535

gag gtt ttc tgt gtc caa cca gag aaa gga ggc cgt aag cca gcc cgc 4656
Glu Val Phe Cys Val Gln Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg
1540 1545 1550

ctt atc gta ttc cca gat ctg gga gtc cgt gta tgc gag aag atg gcc 4704
Leu Ile Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala
1555 1560 1565

ctc tat gat gtg gtc tcc acc ctt cct cag gtc gtg atg ggc tcc tca 4752
Leu Tyr Asp Val Val Ser Thr Leu Pro Gln Val Val Met Gly Ser Ser
1570 1575 1580

tac gga ttc cag tac tct cct ggg cag cga gtc gag ttc ctg gtg aat 4800
Tyr Gly Phe Gln Tyr Ser Pro Gly Gln Arg Val Glu Phe Leu Val Asn
1585 1590 1595 1600

acc tgg aaa tca aag aaa aac ccc atg ggc ttt tca tat gac act cgc 4848
Thr Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg
1605 1610 1615

tgt ttc gac tca acg gtc acc gag aac gac atc cgt gtt gag gag tca 4896
Cys Phe Asp Ser Thr Val Thr Glu Asn Asp Ile Arg Val Glu Glu Ser
1620 1625 1630

att tac caa tgt tgt gac ttg gcc ccc gaa gcc aga cag gcc ata aaa 4944
Ile Tyr Gln Cys Cys Asp Leu Ala Pro Glu Ala Arg Gln Ala Ile Lys
1635 1640 1645

tcg ctc aca gag cgg ctt tat atc ggg ggt cct ctg act aat tca aaa 4992
Ser Leu Thr Glu Arg Leu Tyr Ile Gly Gly Pro Leu Thr Asn Ser Lys
1650 1655 1660

ggg cag aac tgc ggt tat cgc cgg tgc cgc gcg agc ggc gtg ctg acg 5040
Gly Gln Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr
1665 1670 1675 1680

act agc tgc ggt aac acc ctc aca tgt tac ttg aag gcc tct gca gcc 5088
Thr Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala
1685 1690 1695

tgt cga gct gcg aag ctc cag gac tgc acg atg ctc gtg aac gga gac 5136
Cys Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp
1700 1705 1710

gac ctt gtc gtt atc tgt gaa agc gcg gga acc caa gag gac gcg gcg 5184
Asp Leu Val Val Ile Cys Glu Ser Ala Gly Thr Gln Glu Asp Ala Ala
1715 1720 1725

agc cta cga gtc ttc acg gag gct atg act agg tac tct gcc ccc ccc 5232
Ser Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro
1730 1735 1740

ggg gac ccg ccc caa cca gaa tac gac ttg gag ctg ata aca tca tgt 5280
Gly Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys
1745 1750 1755 1760

tcc tcc aat gtg tcg gtc gcc cac gat gca tca ggc aaa agg gtg tac 5328
Ser Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr
1765 1770 1775

tac ctc acc cgt gat ccc acc acc ccc ctc gca cgg gct gcg tgg gaa 5376
Tyr Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu
1780 1785 1790

aca gct aga cac act cca gtt aac tcc tgg cta ggc aac att atc atg 5424
Thr Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn Ile Ile Met
1795 1800 1805

tat gcg ccc act ttg tgg gca agg atg att ctg atg act cac ttc ttc 5472
Tyr Ala Pro Thr Leu Trp Ala Arg Met Ile Leu Met Thr His Phe Phe
1810 1815 1820

tcc atc ctt cta gca cag gag caa ctt gaa aaa gcc ctg gac tgc cag 5520
Ser Ile Leu Leu Ala Gln Glu Gln Leu Glu Lys Ala Leu Asp Cys Gln
1825 1830 1835 1840

atc tac ggg gcc tgt tac tcc att gag cca ctt gac cta cct cag atc 5568
Ile Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile
1845 1850 1855

att gaa cga ctc cat ggc ctt agc gca ttt tca ctc cat agt tac tct 5616
Ile Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser
1860 1865 1870

cca ggt gag atc aat agg gtg gct tca tgc ctc agg aaa ctt ggg gta 5664
Pro Gly Glu Ile Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val
1875 1880 1885

cca ccc ttg cga gtc tgg aga cat cgg gcc agg agc gtc cgc gct agg 5712
Pro Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg
1890 1895 1900

cta ctg tcc cag ggg ggg agg gcc gcc act tgt ggc aag tac ctc ttc 5760
Leu Leu Ser Gln Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe
1905 1910 1915 1920

aac tgg gca gtg aag acc aaa ctc aaa ctc act cca atc ccg gct gcg 5808
Asn Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala
1925 1930 1935

tcc cag ctg gac ttg tcc ggc tgg ttc gtt gct ggt tac agc ggg gga 5856
Ser Gln Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly
1940 1945 1950

gac ata tat cac agc ctg tct cgt gcc cga ccc cgc tgg ttc atg ctg 5904
Asp Ile Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu
1955 1960 1965

tgc cta ctc cta ctt tct gta ggg gta ggc atc tac ctg ctc ccc aac 5952
Cys Leu Leu Leu Leu Ser Val Gly Val Gly Ile Tyr Leu Leu Pro Asn
1970 1975 1980



cga 5955
Arg
1985

<210> 6
<211> 1984
<212> PRT

<213> Artificial Sequence 
<220> 

<223> NS sequence 

<400> 6

Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly Cys

1 5 10 15

He He Thr Ser Leu Thr Gly Arg Asp Lys Asn Gin Val Glu Gly Glu 

20 25 30 

Val Gin Val Val Ser Thr Ala Thr Gin Ser Phe Leu Ala Thr Cys Val 

35 40 45 

Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr Leu

50 55 60 

Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp Gln
65 70 75 80 

Asp Leu Val Gly Trp Gin Ala Pro Pro Gly Ala Arg Ser Leu Thr Pro 

85 90 95 

Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala Asp 

100 105 110

Val lie Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu Ser 

115 120 125 

Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu Leu 

130 135 140 

Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys Thr
145 150 155 160

Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met Glu

165 170 175 

Thr Thr Met Arg Ser Pro Val Phe Thr Asp Asn Ser Ser Pro Pro Ala 

180 185 190 

Val Pro Gin Ser Phe Gin Val Ala His Leu His Ala Pro Thr Gly Ser 

195 200 205 

Gly Lys Ser Thr Lys Val Pro Ala Ala Tyr Ala Ala Gin Gly Tyr Lys 

210 215 220 

Val Leu Val Leu Asn Pro Ser Val Ala Ala Thr Leu Gly Phe Gly Ala 
225 230 235 240 

Tyr Met Ser Lys Ala His Gly He Asp Pro Asn lie Arg Thr Gly Val 

245 250 255 

Arg Thr lie Thr Thr Gly Ala Pro Val Thr Tyr Ser Thr Tyr Gly Lys 

260 265 270 

Phe Leu Ala Asp Gly Gly Cys Ser Gly Gly Ala Tyr Asp He He He 

275 280 285 

Cys Asp Glu Cys His Ser Thr Asp Ser Thr Thr lie Leu Gly lie Gly 

290 295 300 

Thr Val Leu Asp Gin Ala Glu Thr Ala Gly Ala Arg Leu Val Val Leu 
305 310 315 320 

Ala Thr Ala Thr Pro Pro Gly Ser Val Thr Val Pro His Pro Asn lie 

325 330 335 

Glu Glu Val Ala Leu Ser Asn Thr Gly Glu lie Pro Phe Tyr Gly Lys 

340 345 350 

Ala lie Pro He Glu Ala lie Arg Gly Gly Arg His Leu lie Phe Cys 

355 360 365 

His Ser Lys Lys Lys Cys Asp Glu Leu Ala Ala Lys Leu Ser Gly Leu 
370 375 380

Gly Ile Asn Ala Val Ala Tyr Tyr Arg Gly Leu Asp Val Ser Val Ile
385 390 395 400 

Pro Thr He Gly Asp Val Val Val Val Ala Thr Asp Ala Leu Met Thr 

405 410 415 

Gly Tyr Thr Gly Asp Phe Asp Ser Val He Asp Cys Asn Thr Cys Val 

420 425 430 

Thr Gin Thr Val Asp Phe Ser Leu Asp Pro Thr Phe Thr He Glu Thr 

435 440 445 

Thr Thr Val Pro Gin Asp Ala Val Ser Arg Ser Gin Arg Arg Gly Arg 

450 455 460 

Thr Gly Arg Gly Arg Arg Gly He Tyr Arg Phe Val Thr Pro Gly Glu 
465 470 475 480 

Arg Pro Ser Gly Met Phe Asp Ser Ser Val Leu Cys Glu Cys Tyr Asp 

485 490 495 

Ala Gly Cys Ala Trp Tyr Glu Leu Thr Pro Ala Glu Thr Ser Val Arg 

500 505 510 

Leu Arg Ala Tyr Leu Asn Thr Pro Gly Leu Pro Val Cys Gin Asp His 

515 520 525 

Leu Glu Phe Trp Glu Ser Val Phe Thr Gly Leu Thr His He Asp Ala 

530 535 540 

His Phe Leu Ser Gin Thr Lys Gin Ala Gly Asp Asn Phe Pro Tyr Leu 
545 550 555 560 

Val Ala Tyr Gin Ala Thr Val Cys Ala Arg Ala Gin Ala Pro Pro Pro 

565 570 575 

Ser Trp Asp Gin Met Trp Lys Cys Leu He Arg Leu Lys Pro Thr Leu 

580 585 590 

His Gly Pro Thr Pro Leu Leu Tyr Arg Leu Gly Ala Val Gin Asn Glu 

595 600 605 

Val Thr Leu Thr His Pro He Thr Lys Tyr He Met Ala Cys Met Ser 

610 615 620

Ala Asp Leu Glu Val Val Thr Ser Thr Trp Val Leu Val Gly Gly Val 
625 630 635 640 

Leu Ala Ala Leu Ala Ala Tyr Cys Leu Thr Thr Gly Ser Val Val He 

645 650 655 

Val Gly Arg He He Leu Ser Gly Arg Pro Ala lie Val Pro Asp Arg 

660 665 670 

Glu Phe Leu Tyr Gin Glu Phe Asp Glu Met Glu Glu Cys Ala Ser His 

675 680 685 

Leu Pro Tyr He Glu Gin Gly Met Gin Leu Ala Glu Gin Phe Lys Gin 

690 695 700 

Lys Ala Leu Gly Leu Leu Gin Thr Ala Thr Lys Gin Ala Glu Ala Ala 
705 710 715 720 

Ala Pro Val Val Glu Ser Lys Trp Arg Ala Leu Glu Thr Phe Trp Ala 

725 730 735 

Lys His Met Trp Asn Phe lie Ser Gly lie Gin Tyr Leu Ala Gly Leu 

740 745 750 

Ser Thr Leu Pro Gly Asn Pro Ala He Ala Ser Leu Met Ala Phe Thr 

755 760 765 

Ala Ser He Thr Ser Pro Leu Thr Thr Gin Ser Thr Leu Leu Phe Asn 

770 775 780 

lie Leu Gly Gly Trp Val Ala Ala Gin Leu Ala Pro Pro Ser Ala Ala 
785 790 795 800 

Ser Ala Phe Val Gly Ala Gly He Ala Gly Ala Ala Val Gly Ser He 

805 810 815 

Gly Leu Gly Lys Val Leu Val Asp Ile Leu Ala Gly Tyr Gly Ala Gly
820 825 830

Val Ala Gly Ala Leu Val Ala Phe Lys Val Met Ser Gly Glu Met Pro
835 840 845

Ser Thr Glu Asp Leu Val Asn Leu Leu Pro Ala Ile Leu Ser Pro Gly
850 855 860

Ala Leu Val Val Gly Val Val Cys Ala Ala Ile Leu Arg Arg His Val
865 870 875 880

Gly Pro Gly Glu Gly Ala Val Gln Trp Met Asn Arg Leu Ile Ala Phe
885 890 895

Ala Ser Arg Gly Asn His Val Ser Pro Thr His Tyr Val Pro Glu Ser
900 905 910

Asp Ala Ala Ala Arg Val Thr Gln Ile Leu Ser Ser Leu Thr Ile Thr
915 920 925

Gln Leu Leu Lys Arg Leu His Gln Trp Ile Asn Glu Asp Cys Ser Thr
930 935 940

Pro Cys Ser Gly Ser Trp Leu Arg Asp Val Trp Asp Trp Ile Cys Thr
945 950 955 960

Val Leu Thr Asp Phe Lys Thr Trp Leu Gln Ser Lys Leu Leu Pro Gln
965 970 975

Leu Pro Gly Val Pro Phe Phe Ser Cys Gln Arg Gly Tyr Lys Gly Val
980 985 990

Trp Arg Gly Asp Gly Ile Met Gln Thr Thr Cys Pro Cys Gly Ala Gln
995 1000 1005

Ile Thr Gly His Val Lys Asn Gly Ser Met Arg Ile Val Gly Pro Lys
1010 1015 1020

Thr Cys Ser Asn Thr Trp His Gly Thr Phe Pro Ile Asn Ala Tyr Thr
1025 1030 1035 1040

Thr Gly Pro Cys Thr Pro Ser Pro Ala Pro Asn Tyr Ser Arg Ala Leu
1045 1050 1055

Trp Arg Val Ala Ala Glu Glu Tyr Val Glu Val Thr Arg Val Gly Asp
1060 1065 1070

Phe His Tyr Val Thr Gly Met Thr Thr Asp Asn Val Lys Cys Pro Cys
1075 1080 1085

Gln Val Pro Ala Pro Glu Phe Phe Thr Glu Val Asp Gly Val Arg Leu
1090 1095 1100

His Arg Tyr Ala Pro Ala Cys Arg Pro Leu Leu Arg Glu Glu Val Thr
1105 1110 1115 1120

Phe Gln Val Gly Leu Asn Gln Tyr Leu Val Gly Ser Gln Leu Pro Cys
1125 1130 1135

Glu Pro Glu Pro Asp Val Ala Val Leu Thr Ser Met Leu Thr Asp Pro
1140 1145 1150

Ser His Ile Thr Ala Glu Thr Ala Lys Arg Arg Leu Ala Arg Gly Ser
1155 1160 1165

Pro Pro Ser Leu Ala Ser Ser Ser Ala Ser Gln Leu Ser Ala Pro Ser
1170 1175 1180

Leu Lys Ala Thr Cys Thr Thr His His Val Ser Pro Asp Ala Asp Leu
1185 1190 1195 1200

Ile Glu Ala Asn Leu Leu Trp Arg Gln Glu Met Gly Gly Asn Ile Thr
1205 1210 1215

Arg Val Glu Ser Glu Asn Lys Val Val Val Leu Asp Ser Phe Asp Pro
1220 1225 1230

Leu Arg Ala Glu Glu Asp Glu Arg Glu Val Ser Val Pro Ala Glu Ile
1235 1240 1245

Leu Arg Lys Ser Lys Lys Phe Pro Ala Ala Met Pro Ile Trp Ala Arg
1250 1255 1260

Pro Asp Tyr Asn Pro Pro Leu Leu Glu Ser Trp Lys Asp Pro Asp Tyr
1265 1270 1275 1280

Val Pro Pro Val Val His Gly Cys Pro Leu Pro Pro lie Lys Ala Pro 

1285 1290 1295 

Pro He Pro Pro Pro Arg Arg Lys Arg Thr Val Val Leu Thr Glu Ser 

1300 1305 1310 

Ser Val Ser Ser Ala Leu Ala Glu Leu Ala Thr Lys Thr Phe Gly Ser 

1315 1320 1325 

Ser Glu Ser Ser Ala Val Asp Ser Gly Thr Ala Thr Ala Leu Pro Asp 

1330 1335 1340 

Gin Ala Ser Asp Asp Gly Asp Lys Gly Ser Asp Val Glu Ser Tyr Ser 
1345 1350 1355 1360 

Ser Met Pro Pro Leu Glu Gly Glu Pro Gly Asp Pro Asp Leu Ser Asp 

1365 1370 1375 

Gly Ser Trp Ser Thr Val Ser Glu Glu Ala Ser Glu Asp Val Val Cys 

1380 1385 1390 

Cys Ser Met Ser Tyr Thr Trp Thr Gly Ala Leu He Thr Pro Cys Ala 

1395 1400 1405 

Ala Glu Glu Ser Lys Leu Pro He Asn Ala Leu Ser Asn Ser Leu Leu 

1410 1415 1420 

Arg His His Asn Met Val Tyr Ala Thr Thr Ser Arg Ser Ala Gly Leu 
1425 1430 1435 1440 

Arg Gin Lys Lys Val Thr Phe Asp Arg Leu Gin Val Leu Asp Asp His 

1445 1450 1455 

Tyr Arg Asp Val Leu Lys Glu Met Lys Ala Lys Ala Ser Thr Val Lys 

1460 1465 1470 

Ala Lys Leu Leu Ser Val Glu Glu Ala Cys Lys Leu Thr Pro Pro His 

1475 1480 1485 

Ser Ala Lys Ser Lys Phe Gly Tyr Gly Ala Lys Asp Val Arg Asn Leu 

1490 1495 1500 

Ser Ser Lys Ala Val Asn His He His Ser Val Trp Lys Asp Leu Leu 
1505 1510 1515 1520 

Glu Asp Thr Val Thr Pro He Asp Thr Thr He Met Ala Lys Asn Glu 

1525 1530 1535 

Val Phe Cys Val Gin Pro Glu Lys Gly Gly Arg Lys Pro Ala Arg Leu 

1540 1545 1550 

He Val Phe Pro Asp Leu Gly Val Arg Val Cys Glu Lys Met Ala Leu 

1555 1560 1565 

Tyr Asp Val Val Ser Thr Leu Pro Gin Val Val Met Gly Ser Ser Tyr 

1570 1575 1580 

Gly Phe Gin Tyr Ser Pro Gly Gin Arg Val Glu Phe Leu Val Asn Thr 
1585 1590 1595 1600 

Trp Lys Ser Lys Lys Asn Pro Met Gly Phe Ser Tyr Asp Thr Arg Cys 

1605 1610 1615 

Phe Asp Ser Thr Val Thr Glu Asn Asp He Arg Val Glu Glu Ser He 

1620 1625 1630 

Tyr Gin Cys Cys Asp Leu Ala Pro Glu Ala Arg Gin Ala lie Lys Ser 

1635 1640 1645 

Leu Thr Glu Arg Leu Tyr lie Gly Gly Pro Leu Thr Asn Ser Lys Gly 

1650 1655 1660 

Gin Asn Cys Gly Tyr Arg Arg Cys Arg Ala Ser Gly Val Leu Thr Thr 
1665 1670 1675 1680 

Ser Cys Gly Asn Thr Leu Thr Cys Tyr Leu Lys Ala Ser Ala Ala Cys 
1685 1690 1695 

Arg Ala Ala Lys Leu Gln Asp Cys Thr Met Leu Val Asn Gly Asp Asp

1700 1705 1710 

Leu Val Val He Cys Glu Ser Ala Gly Thr Gin Glu Asp Ala Ala Ser 

1715 1720 1725 

Leu Arg Val Phe Thr Glu Ala Met Thr Arg Tyr Ser Ala Pro Pro Gly 

1730 1735 1740 

Asp Pro Pro Gln Pro Glu Tyr Asp Leu Glu Leu Ile Thr Ser Cys Ser
1745 1750 1755 1760 

Ser Asn Val Ser Val Ala His Asp Ala Ser Gly Lys Arg Val Tyr Tyr 

1765 1770 1775 

Leu Thr Arg Asp Pro Thr Thr Pro Leu Ala Arg Ala Ala Trp Glu Thr 

1780 1785 1790

Ala Arg His Thr Pro Val Asn Ser Trp Leu Gly Asn He He Met Tyr 

1795 1800 1805 

Ala Pro Thr Leu Trp Ala Arg Met lie Leu Met Thr His Phe Phe Ser 

1810 1815 1820 

He Leu Leu Ala Gin Glu Gin Leu Glu Lys Ala Leu Asp Cys Gin He 
1825 1830 1835 1840 

Tyr Gly Ala Cys Tyr Ser Ile Glu Pro Leu Asp Leu Pro Gln Ile Ile

1845 1850 1855 

Glu Arg Leu His Gly Leu Ser Ala Phe Ser Leu His Ser Tyr Ser Pro 

1860 1865 1870

Gly Glu He Asn Arg Val Ala Ser Cys Leu Arg Lys Leu Gly Val Pro 

1875 1880 1885 

Pro Leu Arg Val Trp Arg His Arg Ala Arg Ser Val Arg Ala Arg Leu 

1890 1895 1900 

Leu Ser Gin Gly Gly Arg Ala Ala Thr Cys Gly Lys Tyr Leu Phe Asn 
1905 1910 1915 1920 

Trp Ala Val Lys Thr Lys Leu Lys Leu Thr Pro Ile Pro Ala Ala Ser

1925 1930 1935

Gin Leu Asp Leu Ser Gly Trp Phe Val Ala Gly Tyr Ser Gly Gly Asp 

1940 1945 1950

He Tyr His Ser Leu Ser Arg Ala Arg Pro Arg Trp Phe Met Leu Cys 

1955 1960 1965 

Leu Leu Leu Leu Ser Val Gly Val Gly He Tyr Leu Leu Pro Asn Arg 
1970 1975 1980 

<210> 7 
<211> 4909 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pVU nucleic acid 
<400> 7 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240

ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300

tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360

ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420

cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480

catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540

tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600

tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660 

ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720 

catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780 

cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840 

ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900 

agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960 

tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020 

tccccgtgcc aagagtgacg taagtaccgc ctatagactc tataggcaca cccctttggc 1080 

tcttatgcat gctatactgt ttttggcttg gggcctatac acccccgctt ccttatgcta 1140 

taggtgatgg tatagcttag cctataggtg tgggttattg accattattg accactcccc 1200 

tattggtgac gatactttcc attactaatc cataacatgg ctctttgcca caactatctc 1260 

tattggctat atgccaatac tctgtccttc agagactgac acggactctg tatttttaca 1320 

ggatggggtc ccatttatta tttacaaatt cacatataca acaacgccgt cccccgtgcc 1380 

cgcagttttt attaaacata gcgtgggatc tccacgcgaa tctcgggtac gtgttccgga 1440 

catgggctct tctccggtag cggcggagct tccacatccg agccctggtc ccatgcctcc 1500 

agcggctcat ggtcgctcgg cagctccttg ctcctaacag tggaggccag acttaggcac 1560 

agcacaatgc ccaccaccac cagtgtgccg cacaaggccg tggcggtagg gtatgtgtct 1620 

gaaaatgagc gtggagattg ggctcgcacg gctgacgcag atggaagact taaggcagcg 1680 

gcagaagaag atgcaggcag ctgagttgtt gtattctgat aagagtcaga ggtaactccc 1740 

gttgcggtgc tgttaacggt ggagggcagt gtagtctgag cagtactcgt tgctgccgcg 1800 

cgcgccacca gacataatag ctgacagact aacagactgt tcctttccat gggtcttttc 1860 

tgcagtcacc gtccttagat ctaggtacca gatatcagaa ttcagtcgac agcggccgcg 1920 

atctgctgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt 1980 

gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca 2040 

ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca gcaaggggga 2100 

ggattgggaa gacaatagca ggcatgctgg ggatgcggtg ggctctatgg ccgctgcggc 2160 

caggtgctga agaattgacc cggttcctcc tgggccagaa agaagcaggc acatcccctt 2220 

ctctgtgaca caccctgtcc acgcccctgg ttcttagttc cagccccact cataggacac 2280 

tcatagctca ggagggctcc gccttcaatc ccacccgcta aagtacttgg agcggtctct 2340 

ccctccctca tcagcccacc aaaccaaacc tagcctccaa gagtgggaag aaattaaagc 2400 

aagataggct attaagtgca gagggagaga aaatgcctcc aacatgtgag gaagtaatga 2460 

gagaaatcat agaatttctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 2520 

gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 2580 

ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 2640 

ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 2700 

acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 2760 

tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 2820 

ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 2880 

ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 2940 

ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 3000 

actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 3060 

gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 3120 

tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 3180 

caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 3240 

atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 3300 

acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 3360 

ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 3420 

ccaatgctta atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt 3480 

tgcctgactc gggggggggg ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc 3540 

ataccaggcc tgaatcgccc catcatccag ccagaaagtg agggagccac ggttgatgag 3600 

agctttgttg taggtggacc agttggtgat tttgaacttt tgctttgcca cggaacggtc 3660 

tgcgttgtcg ggaagatgcg tgatctgatc cttcaactca gcaaaagttc gatttattca 3720 

acaaagccgc cgtcccgtca agtcagcgta atgctctgcc agtgttacaa ccaattaacc 3780 

aattctgatt agaaaaactc atcgagcatc aaatgaaact gcaatttatt catatcagga 3840

ttatcaatac catatttttg aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg 3900

cagttccata ggatggcaag atcctggtat cggtctgcga ttccgactcg tccaacatca 3960

atacaaccta ttaatttccc ctcgtcaaaa ataaggttat caagtgagaa atcaccatga 4020

gtgacgactg aatccggtga gaatggcaaa agcttatgca tttctttcca gacttgttca 4080

acaggccagc cattacgctc gtcatcaaaa tcactcgcat caaccaaacc gttattcatt 4140

cgtgattgcg cctgagcgag acgaaatacg cgatcgctgt taaaaggaca attacaaaca 4200

ggaatcgaat gcaaccggcg caggaacact gccagcgcat caacaatatt ttcacctgaa 4260 

tcaggatatt cttctaatac ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac 4320 

catgcatcat caggagtacg gataaaatgc ttgatggtcg gaagaggcat aaattccgtc 4380 

agccagttta gtctgaccat ctcatctgta acatcattgg caacgctacc tttgccatgt 4440 

ttcagaaaca actctggcgc atcgggcttc ccatacaatc gatagattgt cgcacctgat 4500 

tgcccgacat tatcgcgagc ccatttatac ccatataaat cagcatccat gttggaattt 4560 

aatcgcggcc tcgagcaaga cgtttcccgt tgaatatggc tcataacacc ccttgtatta 4620 

ctgtttatgt aagcagacag ttttattgtt catgatgata tatttttatc ttgtgcaatg 4680 

taacatcaga gattttgaga cacaacgtgg ctttcccccc ccccccatta ttgaagcatt 4740 

tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa aaataaacaa 4800 

ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga aaccattatt 4860 

atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtc 4909 

<210> 8 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 6 

<400> 8

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480 

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540 

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600 

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660 

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720 

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780 

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840 

cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900 

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960 

cgaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020 

caggtcttgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080 

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140 

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200 

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260 

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320 

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380 

ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440 

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500 

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560 

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620 

gagataatgt ttaacttgca cggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680 

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740 

ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800 
tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980 

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460 

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520 

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760 

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820 

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880 

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940 

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000 

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240 

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360 

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420 

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540 

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600 

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660 

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720 

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840 

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900 

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960 

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140 

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200 

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260 

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320 

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380 

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440 

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500 

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560 

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620 

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680 

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740 

gtataatggt tccatccggc ccaggggcgc agttaccctc acagatttgc atttcccacg 4800 

ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860 

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980 

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040 

ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100 
caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760 

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820 

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240 

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300 

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360 

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480 

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200 

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260 

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620 

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680 

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800 

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980 

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 

aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400 
gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520 

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580 

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640 

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700 

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760 

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820 

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880 

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940 

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000 

tcgttccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060 

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120 

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180 

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240 

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300 

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360 

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420 

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480 

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540 

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600 

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660 

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720 

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780 

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840 

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900 

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960 

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020 

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080 

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140 

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200 

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260 

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320 

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380 

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440 

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500 

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560 

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620 

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680 

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740 

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800 

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860 

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920 

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980 

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040 

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100 

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160 

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220 

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280 

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340 

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400 

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460 

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520 

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580 

gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640 

gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700 
gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120 

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300 

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540 

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500 

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680 

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800 

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220 

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280 

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520 

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtctt gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 

caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120 

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180 

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240 

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300 

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360 

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420 

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480 

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540 

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600 

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660 

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720 

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780 

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840 

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900 

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960 

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020 

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080 

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140 

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200 

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260 

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320 

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380 

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440 

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500 

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560 

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620 

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680 

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740 

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800 

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860 

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920 

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980 

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040 

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100 

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160 

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220 

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280 

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340 

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400 

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460 

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520 

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580 

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640 

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700 

tttcccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760 

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820 

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880 

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940 

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000 

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060 

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120 

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180 

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240 

tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300

aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacttcag 18960

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500

tgggagagtc cttaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340

ctacgactac atgaacaagc gagtggtggc tcccgggtta gtggactgct acattaacct 20400

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460

tgctggcctg cgctaccgct caatgttgct gggcaatggt cgctatgtgc ccttccacat 20520

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540

gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720 

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780 

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840 

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900 

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960 

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020 

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080 

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260 

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320 

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380 

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500 

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560 

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620 

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740 

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800 

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860 

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920 

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980 

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040 

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100 

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160 

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220 

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280 

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400 

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460 

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520 

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580 

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700 

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820 

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940 

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000 

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060 

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120 

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300 

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360 

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480 

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540 

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600 

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660 

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720 

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900 

atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960 

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020 

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080 

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140 

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200 

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260 

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320 

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380 

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440 

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500 

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560 

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620 

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680 

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740 

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800 

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860 

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920 

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980 

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040 

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100 

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160 

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220 

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280 

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340 

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400 

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460 

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520 

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580 

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640 

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700 

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760 

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820 

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880 

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940 

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000 

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060 

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120 

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180 

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240 

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300 

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360 

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420 

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480 

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540 

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600 

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660 

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720 

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780 

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840 

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900 

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960 

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020 

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080 

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140 

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200 

cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260 

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320 

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380 

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440 

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500 

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560 

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620 

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680 

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740 

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800 

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860 

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920 

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980 

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040 

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100 

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160 

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220 

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280 

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340 

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400 

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460 

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520 

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580 

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640 

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700 

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760 

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820 

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880 

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940 

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000 

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060 

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120 

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180 

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240 

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300 

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360 

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420 

tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480 

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540 

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600 

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660 

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720 

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780 

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840 

aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900 

ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960 

aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020 

actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080 

gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140 

gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200 

cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260 

gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320 

aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380 

gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440 

ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500 

gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560 

actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620 

gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680 

acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740 

tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800 

aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860 

tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920 

aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980 

aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040 

acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100 

acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160 

gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220 

aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280 

aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340 

gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400 

agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460 

gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520 

gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580 

agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640 

attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700 

ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760 

actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820 

ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880 

tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940 

acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000 

catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060 

caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120 

gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180 

agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240 

ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300 

ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360 

ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420 

aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480 

agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540 

cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600 

tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660 

ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720 

ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780 

cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840 

aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900 

cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960 

agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020 

caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080 

tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140 

tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200 

tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260 

cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320 

cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380 

aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440 

aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500 

ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560 

ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620 

agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680 

ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740 

gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800 

gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860 

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920 

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980 

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040 

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100 

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160 

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220 

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280 

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340 

cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400 

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460 

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520 

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580 

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640 

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700 

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760 

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820 

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880 

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935 

<210> 9 
<211> 35935 
<212> DNA 

<213> Adenovirus serotype 5 
<400> 9 

catcatcaat aatatacctt attttggatt gaagccaata tgataatgag ggggtggagt 60 

ttgtgacgtg gcgcggggcg tgggaacggg gcgggtgacg tagtagtgtg gcggaagtgt 120 

gatgttgcaa gtgtggcgga acacatgtaa gcgacggatg tggcaaaagt gacgtttttg 180 

gtgtgcgccg gtgtacacag gaagtgacaa ttttcgcgcg gttttaggcg gatgttgtag 240 

taaatttggg cgtaaccgag taagatttgg ccattttcgc gggaaaactg aataagagga 300 

agtgaaatct gaataatttt gtgttactca tagcgcgtaa tatttgtcta gggccgcggg 360 

gactttgacc gtttacgtgg agactcgccc aggtgttttt ctcaggtgtt ttccgcgttc 420 

cgggtcaaag ttggcgtttt attattatag tcagctgacg tgtagtgtat ttatacccgg 480 

tgagttcctc aagaggccac tcttgagtgc cagcgagtag agttttctcc tccgagccgc 540 

tccgacaccg ggactgaaaa tgagacatat tatctgccac ggaggtgtta ttaccgaaga 600 

aatggccgcc agtcttttgg accagctgat cgaagaggta ctggctgata atcttccacc 660 

tcctagccat tttgaaccac ctacccttca cgaactgtat gatttagacg tgacggcccc 720 

cgaagatccc aacgaggagg cggtttcgca gatttttccc gactctgtaa tgttggcggt 780 

gcaggaaggg attgacttac tcacttttcc gccggcgccc ggttctccgg agccgcctca 840 

cctttcccgg cagcccgagc agccggagca gagagccttg ggtccggttt ctatgccaaa 900 

ccttgtaccg gaggtgatcg atcttacctg ccacgaggct ggctttccac ccagtgacga 960 

cgaggatgaa gagggtgagg agtttgtgtt agattatgtg gagcaccccg ggcacggttg 1020 

caggtcttgt cattatcacc ggaggaatac gggggaccca gatattatgt gttcgctttg 1080 

ctatatgagg acctgtggca tgtttgtcta cagtaagtga aaattatggg cagtgggtga 1140 

tagagtggtg ggtttggtgt ggtaattttt tttttaattt ttacagtttt gtggtttaaa 1200 

gaattttgta ttgtgatttt tttaaaaggt cctgtgtctg aacctgagcc tgagcccgag 1260 

ccagaaccgg agcctgcaag acctacccgc cgtcctaaaa tggcgcctgc tatcctgaga 1320 

cgcccgacat cacctgtgtc tagagaatgc aatagtagta cggatagctg tgactccggt 1380 

ccttctaaca cacctcctga gatacacccg gtggtcccgc tgtgccccat taaaccagtt 1440 

gccgtgagag ttggtgggcg tcgccaggct gtggaatgta tcgaggactt gcttaacgag 1500 

cctgggcaac ctttggactt gagctgtaaa cgccccaggc cataaggtgt aaacctgtga 1560 

ttgcgtgtgt ggttaacgcc tttgtttgct gaatgagttg atgtaagttt aataaagggt 1620 

gagataatgt ttaacttgca tggcgtgtta aatggggcgg ggcttaaagg gtatataatg 1680 

cgccgtgggc taatcttggt tacatctgac ctcatggagg cttgggagtg tttggaagat 1740 
ttttctgctg tgcgtaactt gctggaacag agctctaaca gtacctcttg gttttggagg 1800 

tttctgtggg gctcatccca ggcaaagtta gtctgcagaa ttaaggagga ttacaagtgg 1860 

gaatttgaag agcttttgaa atcctgtggt gagctgtttg attctttgaa tctgggtcac 1920 

caggcgcttt tccaagagaa ggtcatcaag actttggatt tttccacacc ggggcgcgct 1980 

gcggctgctg ttgctttttt gagttttata aaggataaat ggagcgaaga aacccatctg 2040 

agcggggggt acctgctgga ttttctggcc atgcatctgt ggagagcggt tgtgagacac 2100 

aagaatcgcc tgctactgtt gtcttccgtc cgcccggcga taataccgac ggaggagcag 2160 

cagcagcagc aggaggaagc caggcggcgg cggcaggagc agagcccatg gaacccgaga 2220 

gccggcctgg accctcggga atgaatgttg tacaggtggc tgaactgtat ccagaactga 2280 

gacgcatttt gacaattaca gaggatgggc aggggctaaa gggggtaaag agggagcggg 2340 

gggcttgtga ggctacagag gaggctagga atctagcttt tagcttaatg accagacacc 2400 

gtcctgagtg tattactttt caacagatca aggataattg cgctaatgag cttgatctgc 2460 

tggcgcagaa gtattccata gagcagctga ccacttactg gctgcagcca ggggatgatt 2520 

ttgaggaggc tattagggta tatgcaaagg tggcacttag gccagattgc aagtacaaga 2580 

tcagcaaact tgtaaatatc aggaattgtt gctacatttc tgggaacggg gccgaggtgg 2640 

agatagatac ggaggatagg gtggccttta gatgtagcat gataaatatg tggccggggg 2700 

tgcttggcat ggacggggtg gttattatga atgtaaggtt tactggcccc aattttagcg 2760 

gtacggtttt cctggccaat accaacctta tcctacacgg tgtaagcttc tatgggttta 2820 

acaatacctg tgtggaagcc tggaccgatg taagggttcg gggctgtgcc ttttactgct 2880 

gctggaaggg ggtggtgtgt cgccccaaaa gcagggcttc aattaagaaa tgcctctttg 2940 

aaaggtgtac cttgggtatc ctgtctgagg gtaactccag ggtgcgccac aatgtggcct 3000 

ccgactgtgg ttgcttcatg ctagtgaaaa gcgtggctgt gattaagcat aacatggtat 3060 

gtggcaactg cgaggacagg gcctctcaga tgctgacctg ctcggacggc aactgtcacc 3120 

tgctgaagac cattcacgta gccagccact ctcgcaaggc ctggccagtg tttgagcata 3180 

acatactgac ccgctgttcc ttgcatttgg gtaacaggag gggggtgttc ctaccttacc 3240 

aatgcaattt gagtcacact aagatattgc ttgagcccga gagcatgtcc aaggtgaacc 3300 

tgaacggggt gtttgacatg accatgaaga tctggaaggt gctgaggtac gatgagaccc 3360 

gcaccaggtg cagaccctgc gagtgtggcg gtaaacatat taggaaccag cctgtgatgc 3420 

tggatgtgac cgaggagctg aggcccgatc acttggtgct ggcctgcacc cgcgctgagt 3480 

ttggctctag cgatgaagat acagattgag gtactgaaat gtgtgggcgt ggcttaaggg 3540 

tgggaaagaa tatataaggt gggggtctta tgtagttttg tatctgtttt gcagcagccg 3600 

ccgccgccat gagcaccaac tcgtttgatg gaagcattgt gagctcatat ttgacaacgc 3660 

gcatgccccc atgggccggg gtgcgtcaga atgtgatggg ctccagcatt gatggtcgcc 3720 

ccgtcctgcc cgcaaactct actaccttga cctacgagac cgtgtctgga acgccgttgg 3780 

agactgcagc ctccgccgcc gcttcagccg ctgcagccac cgcccgcggg attgtgactg 3840 

actttgcttt cctgagcccg cttgcaagca gtgcagcttc ccgttcatcc gcccgcgatg 3900 

acaagttgac ggctcttttg gcacaattgg attctttgac ccgggaactt aatgtcgttt 3960 

ctcagcagct gttggatctg cgccagcagg tttctgccct gaaggcttcc tcccctccca 4020 

atgcggttta aaacataaat aaaaaaccag actctgtttg gatttggatc aagcaagtgt 4080 

cttgctgtct ttatttaggg gttttgcgcg cgcggtaggc ccgggaccag cggtctcggt 4140 

cgttgagggt cctgtgtatt ttttccagga cgtggtaaag gtgactctgg atgttcagat 4200 

acatgggcat aagcccgtct ctggggtgga ggtagcacca ctgcagagct tcatgctgcg 4260 

gggtggtgtt gtagatgatc cagtcgtagc aggagcgctg ggcgtggtgc ctaaaaatgt 4320 

ctttcagtag caagctgatt gccaggggca ggcccttggt gtaagtgttt acaaagcggt 4380 

taagctggga tgggtgcata cgtggggata tgagatgcat cttggactgt atttttaggt 4440 

tggctatgtt cccagccata tccctccggg gattcatgtt gtgcagaacc accagcacag 4500 

tgtatccggt gcacttggga aatttgtcat gtagcttaga aggaaatgcg tggaagaact 4560 

tggagacgcc cttgtgacct ccaagatttt ccatgcattc gtccataatg atggcaatgg 4620 

gcccacgggc ggcggcctgg gcgaagatat ttctgggatc actaacgtca tagttgtgtt 4680 

ccaggatgag atcgtcatag gccattttta caaagcgcgg gcggagggtg ccagactgcg 4740 

gtataatggt tccatccggc ccaggggcgt agttaccctc acagatttgc atttcccacg 4800 

ctttgagttc agatgggggg atcatgtcta cctgcggggc gatgaagaaa acggtttccg 4860 

gggtagggga gatcagctgg gaagaaagca ggttcctgag cagctgcgac ttaccgcagc 4920 

cggtgggccc gtaaatcaca cctattaccg ggtgcaactg gtagttaaga gagctgcagc 4980 

tgccgtcatc cctgagcagg ggggccactt cgttaagcat gtccctgact cgcatgtttt 5040 
ccctgaccaa atccgccaga aggcgctcgc cgcccagcga tagcagttct tgcaaggaag 5100 

caaagttttt caacggtttg agaccgtccg ccgtaggcat gcttttgagc gtttgaccaa 5160 

gcagttccag gcggtcccac agctcggtca cctgctctac ggcatctcga tccagcatat 5220 

ctcctcgttt cgcgggttgg ggcggctttc gctgtacggc agtagtcggt gctcgtccag 5280 

acgggccagg gtcatgtctt tccacgggcg cagggtcctc gtcagcgtag tctgggtcac 5340 

ggtgaagggg tgcgctccgg gctgcgcgct ggccagggtg cgcttgaggc tggtcctgct 5400 

ggtgctgaag cgctgccggt cttcgccctg cgcgtcggcc aggtagcatt tgaccatggt 5460 

gtcatagtcc agcccctccg cggcgtggcc cttggcgcgc agcttgccct tggaggaggc 5520 

gccgcacgag gggcagtgca gacttttgag ggcgtagagc ttgggcgcga gaaataccga 5580 

ttccggggag taggcatccg cgccgcaggc cccgcagacg gtctcgcatt ccacgagcca 5640 

ggtgagctct ggccgttcgg ggtcaaaaac caggtttccc ccatgctttt tgatgcgttt 5700 

cttacctctg gtttccatga gccggtgtcc acgctcggtg acgaaaaggc tgtccgtgtc 5760 

cccgtataca gacttgagag gcctgtcctc gagcggtgtt ccgcggtcct cctcgtatag 5820 

aaactcggac cactctgaga caaaggctcg cgtccaggcc agcacgaagg aggctaagtg 5880 

ggaggggtag cggtcgttgt ccactagggg gtccactcgc tccagggtgt gaagacacat 5940 

gtcgccctct tcggcatcaa ggaaggtgat tggtttgtag gtgtaggcca cgtgaccggg 6000 

tgttcctgaa ggggggctat aaaagggggt gggggcgcgt tcgtcctcac tctcttccgc 6060 

atcgctgtct gcgagggcca gctgttgggg tgagtactcc ctctgaaaag cgggcatgac 6120 

ttctgcgcta agattgtcag tttccaaaaa cgaggaggat ttgatattca cctggcccgc 6180 

ggtgatgcct ttgagggtgg ccgcatccat ctggtcagaa aagacaatct ttttgttgtc 6240 

aagcttggtg gcaaacgacc cgtagagggc gttggacagc aacttggcga tggagcgcag 6300 

ggtttggttt ttgtcgcgat cggcgcgctc cttggccgcg atgtttagct gcacgtattc 6360 

gcgcgcaacg caccgccatt cgggaaagac ggtggtgcgc tcgtcgggca ccaggtgcac 6420 

gcgccaaccg cggttgtgca gggtgacaag gtcaacgctg gtggctacct ctccgcgtag 6480 

gcgctcgttg gtccagcaga ggcggccgcc cttgcgcgag cagaatggcg gtagggggtc 6540 

tagctgcgtc tcgtccgggg ggtctgcgtc cacggtaaag accccgggca gcaggcgcgc 6600 

gtcgaagtag tctatcttgc atccttgcaa gtctagcgcc tgctgccatg cgcgggcggc 6660 

aagcgcgcgc tcgtatgggt tgagtggggg accccatggc atggggtggg tgagcgcgga 6720 

ggcgtacatg ccgcaaatgt cgtaaacgta gaggggctct ctgagtattc caagatatgt 6780 

agggtagcat cttccaccgc ggatgctggc gcgcacgtaa tcgtatagtt cgtgcgaggg 6840 

agcgaggagg tcgggaccga ggttgctacg ggcgggctgc tctgctcgga agactatctg 6900 

cctgaagatg gcatgtgagt tggatgatat ggttggacgc tggaagacgt tgaagctggc 6960 

gtctgtgaga cctaccgcgt cacgcacgaa ggaggcgtag gagtcgcgca gcttgttgac 7020 

cagctcggcg gtgacctgca cgtctagggc gcagtagtcc agggtttcct tgatgatgtc 7080 

atacttatcc tgtccctttt ttttccacag ctcgcggttg aggacaaact cttcgcggtc 7140 

tttccagtac tcttggatcg gaaacccgtc ggcctccgaa cggtaagagc ctagcatgta 7200 

gaactggttg acggcctggt aggcgcagca tcccttttct acgggtagcg cgtatgcctg 7260 

cgcggccttc cggagcgagg tgtgggtgag cgcaaaggtg tccctgacca tgactttgag 7320 

gtactggtat ttgaagtcag tgtcgtcgca tccgccctgc tcccagagca aaaagtccgt 7380 

gcgctttttg gaacgcggat ttggcagggc gaaggtgaca tcgttgaaga gtatctttcc 7440 

cgcgcgaggc ataaagttgc gtgtgatgcg gaagggtccc ggcacctcgg aacggttgtt 7500 

aattacctgg gcggcgagca cgatctcgtc aaagccgttg atgttgtggc ccacaatgta 7560 

aagttccaag aagcgcggga tgcccttgat ggaaggcaat tttttaagtt cctcgtaggt 7620 

gagctcttca ggggagctga gcccgtgctc tgaaagggcc cagtctgcaa gatgagggtt 7680 

ggaagcgacg aatgagctcc acaggtcacg ggccattagc atttgcaggt ggtcgcgaaa 7740 

ggtcctaaac tggcgaccta tggccatttt ttctggggtg atgcagtaga aggtaagcgg 7800 

gtcttgttcc cagcggtccc atccaaggtt cgcggctagg tctcgcgcgg cagtcactag 7860 

aggctcatct ccgccgaact tcatgaccag catgaagggc acgagctgct tcccaaaggc 7920 

ccccatccaa gtataggtct ctacatcgta ggtgacaaag agacgctcgg tgcgaggatg 7980 

cgagccgatc gggaagaact ggatctcccg ccaccaattg gaggagtggc tattgatgtg 8040 

gtgaaagtag aagtccctgc gacgggccga acactcgtgc tggcttttgt aaaaacgtgc 8100 

gcagtactgg cagcggtgca cgggctgtac atcctgcacg aggttgacct gacgaccgcg 8160 

cacaaggaag cagagtggga atttgagccc ctcgcctggc gggtttggct ggtggtcttc 8220 

tacttcggct gcttgtcctt gaccgtctgg ctgctcgagg ggagttacgg tggatcggac 8280 

caccacgccg cgcgagccca aagtccagat gtccgcgcgc ggcggtcgga gcttgatgac 8340 
aacatcgcgc agatgggagc tgtccatggt ctggagctcc cgcggcgtca ggtcaggcgg 8400 

gagctcctgc aggtttacct cgcatagacg ggtcagggcg cgggctagat ccaggtgata 8460 

cctaatttcc aggggctggt tggtggcggc gtcgatggct tgcaagaggc cgcatccccg 8520 

cggcgcgact acggtaccgc gcggcgggcg gtgggccgcg ggggtgtcct tggatgatgc 8580 

atctaaaagc ggtgacgcgg gcgagccccc ggaggtaggg ggggctccgg acccgccggg 8640 

agagggggca ggggcacgtc ggcgccgcgc gcgggcagga gctggtgctg cgcgcgtagg 8700 

ttgctggcga acgcgacgac gcggcggttg atctcctgaa tctggcgcct ctgcgtgaag 8760 

acgacgggcc cggtgagctt gagcctgaaa gagagttcga cagaatcaat ttcggtgtcg 8820 

ttgacggcgg cctggcgcaa aatctcctgc acgtctcctg agttgtcttg ataggcgatc 8880 

tcggccatga actgctcgat ctcttcctcc tggagatctc cgcgtccggc tcgctccacg 8940 

gtggcggcga ggtcgttgga aatgcgggcc atgagctgcg agaaggcgtt gaggcctccc 9000 

tcgtxccaga cgcggctgta gaccacgccc ccttcggcat cgcgggcgcg catgaccacc 9060 

tgcgcgagat tgagctccac gtgccgggcg aagacggcgt agtttcgcag gcgctgaaag 9120 

aggtagttga gggtggtggc ggtgtgttct gccacgaaga agtacataac ccagcgtcgc 9180 

aacgtggatt cgttgatatc ccccaaggcc tcaaggcgct ccatggcctc gtagaagtcc 9240 

acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg ttaactcctc ctccagaaga 9300 

cggatgagct cggcgacagt gtcgcgcacc tcgcgctcaa aggctacagg ggcctcttct 9360 

tcttcttcaa tctcctcttc cataagggcc tccccttctt cttcttctgg cggcggtggg 9420 

ggagggggga cacggcggcg acgacggcgc accgggaggc ggtcgacaaa gcgctcgatc 9480 

atctccccgc ggcgacggcg catggtctcg gtgacggcgc ggccgttctc gcgggggcgc 9540 

agttggaaga cgccgcccgt catgtcccgg ttatgggttg gcggggggct gccatgcggc 9600 

agggatacgg cgctaacgat gcatctcaac aattgttgtg taggtactcc gccgccgagg 9660 

gacctgagcg agtccgcatc gaccggatcg gaaaacctct cgagaaaggc gtctaaccag 9720 

tcacagtcgc aaggtaggct gagcaccgtg gcgggcggca gcgggcggcg gtcggggttg 9780 

tttctggcgg aggtgctgct gatgatgtaa ttaaagtagg cggtcttgag acggcggatg 9840 

gtcgacagaa gcaccatgtc cttgggtccg gcctgctgaa tgcgcaggcg gtcggccatg 9900 

ccccaggctt cgttttgaca tcggcgcagg tctttgtagt agtcttgcat gagcctttct 9960 

accggcactt cttcttctcc ttcctcttgt cctgcatctc ttgcatctat cgctgcggcg 10020 

gcggcggagt ttggccgtag gtggcgccct cttcctccca tgcgtgtgac cccgaagccc 10080 

ctcatcggct gaagcagggc taggtcggcg acaacgcgct cggctaatat ggcctgctgc 10140 

acctgcgtga gggtagactg gaagtcatcc atgtccacaa agcggtggta tgcgcccgtg 10200 

ttgatggtgt aagtgcagtt ggccataacg gaccagttaa cggtctggtg acccggctgc 10260 

gagagctcgg tgtacctgag acgcgagtaa gccctcgagt caaatacgta gtcgttgcaa 10320 

gtccgcacca ggtactggta tcccaccaaa aagtgcggcg gcggctggcg gtagaggggc 10380 

cagcgtaggg tggccggggc tccgggggcg agatcttcca acataaggcg atgatatccg 10440 

tagatgtacc tggacatcca ggtgatgccg gcggcggtgg tggaggcgcg cggaaagtcg 10500 

cggacgcggt tccagatgtt gcgcagcggc aaaaagtgct ccatggtcgg gacgctctgg 10560 

ccggtcaggc gcgcgcaatc gttgacgctc tagaccgtgc aaaaggagag cctgtaagcg 10620 

ggcactcttc cgtggtctgg tggataaatt cgcaagggta tcatggcgga cgaccggggt 10680 

tcgagccccg tatccggccg tccgccgtga tccatgcggt taccgcccgc gtgtcgaacc 10740 

caggtgtgcg acgtcagaca acgggggagt gctccttttg gcttccttcc aggcgcggcg 10800 

gctgctgcgc tagctttttt ggccactggc cgcgcgcagc gtaagcggtt aggctggaaa 10860 

gcgaaagcat taagtggctc gctccctgta gccggagggt tattttccaa gggttgagtc 10920 

gcgggacccc cggttcgagt ctcggaccgg ccggactgcg gcgaacgggg gtttgcctcc 10980 

ccgtcatgca agaccccgct tgcaaattcc tccggaaaca gggacgagcc ccttttttgc 11040 

ttttcccaga tgcatccggt gctgcggcag atgcgccccc ctcctcagca gcggcaagag 11100 

caagagcagc ggcagacatg cagggcaccc tcccctcctc ctaccgcgtc aggaggggcg 11160 

acatccgcgg ttgacgcggc agcagatggt gattacgaac ccccgcggcg ccgggcccgg 11220 

cactacctgg acttggagga gggcgagggc ctggcgcggc taggagcgcc ctctcctgag 11280 

cggtacccaa gggtgcagct gaagcgtgat acgcgtgagg cgtacgtgcc gcggcagaac 11340 

ctgtttcgcg accgcgaggg agaggagccc gaggagatgc gggatcgaaa gttccacgca 11400 

gggcgcgagc tgcggcatgg cctgaatcgc gagcggttgc tgcgcgagga ggactttgag 11460 

cccgacgcgc gaaccgggat tagtcccgcg cgcgcacacg tggcggccgc cgacctggta 11520 

accgcatacg agcagacggt gaaccaggag attaactttc aaaaaagctt taacaaccac 11580 

gtgcgtacgc ttgtggcgcg cgaggaggtg gctataggac tgatgcatct gtgggacttt 11640 

gtaagcgcgc tggagcaaaa cccaaatagc aagccgctca tggcgcagct gttccttata 11700 

gtgcagcaca gcagggacaa cgaggcattc agggatgcgc tgctaaacat agtagagccc 11760 

gagggccgct ggctgctcga tttgataaac atcctgcaga gcatagtggt gcaggagcgc 11820 

agcttgagcc tggctgacaa ggtggccgcc atcaactatt ccatgcttag cctgggcaag 11880 

ttttacgccc gcaagatata ccatacccct tacgttccca tagacaagga ggtaaagatc 11940 

gaggggttct acatgcgcat ggcgctgaag gtgcttacct tgagcgacga cctgggcgtt 12000 

tatcgcaacg agcgcatcca caaggccgtg agcgtgagcc ggcggcgcga gctcagcgac 12060 

cgcgagctga tgcacagcct gcaaagggcc ctggctggca cgggcagcgg cgatagagag 12120 

gccgagtcct actttgacgc gggcgctgac ctgcgctggg ccccaagccg acgcgccctg 12180 

gaggcagctg gggccggacc tgggctggcg gtggcacccg cgcgcgctgg caacgtcggc 12240 

ggcgtggagg aatatgacga ggacgatgag tacgagccag aggacggcga gtactaagcg 12300 

gtgatgtttc tgatcagatg atgcaagacg caacggaccc ggcggtgcgg gcggcgctgc 12360 

agagccagcc gtccggcctt aactccacgg acgactggcg ccaggtcatg gaccgcatca 12420 

tgtcgctgac tgcgcgcaat cctgacgcgt tccggcagca gccgcaggcc aaccggctct 12480 

ccgcaattct ggaagcggtg gtcccggcgc gcgcaaaccc cacgcacgag aaggtgctgg 12540 

cgatcgtaaa cgcgctggcc gaaaacaggg ccatccggcc cgacgaggcc ggcctggtct 12600 

acgacgcgct gcttcagcgc gtggctcgtt acaacagcgg caacgtgcag accaacctgg 12660 

accggctggt gggggatgtg cgcgaggccg tggcgcagcg tgagcgcgcg cagcagcagg 12720 

gcaacctggg ctccatggtt gcactaaacg ccttcctgag tacacagccc gccaacgtgc 12780 

cgcggggaca ggaggactac accaactttg tgagcgcact gcggctaatg gtgactgaga 12840 

caccgcaaag tgaggtgtac cagtctgggc cagactattt tttccagacc agtagacaag 12900 

gcctgcagac cgtaaacctg agccaggctt tcaaaaactt gcaggggctg tggggggtgc 12960 

gggctcccac aggcgaccgc gcgaccgtgt ctagcttgct gacgcccaac tcgcgcctgt 13020 

tgctgctgct aatagcgccc ttcacggaca gtggcagcgt gtcccgggac acatacctag 13080 

gtcacttgct gacactgtac cgcgaggcca taggtcaggc gcatgtggac gagcatactt 13140 

tccaggagat tacaagtgtc agccgcgcgc tggggcagga ggacacgggc agcctggagg 13200 

caaccctaaa ctacctgctg accaaccggc ggcagaagat cccctcgttg cacagtttaa 13260 

acagcgagga ggagcgcatt ttgcgctacg tgcagcagag cgtgagcctt aacctgatgc 13320 

gcgacggggt aacgcccagc gtggcgctgg acatgaccgc gcgcaacatg gaaccgggca 13380 

tgtatgcctc aaaccggccg tttatcaacc gcctaatgga ctacttgcat cgcgcggccg 13440 

ccgtgaaccc cgagtatttc accaatgcca tcttgaaccc gcactggcta ccgccccctg 13500 

gtttctacac cgggggattc gaggtgcccg agggtaacga tggattcctc tgggacgaca 13560 

tagacgacag cgtgttttcc ccgcaaccgc agaccctgct agagttgcaa cagcgcgagc 13620 

aggcagaggc ggcgctgcga aaggaaagct tccgcaggcc aagcagcttg tccgatctag 13680 

gcgctgcggc cccgcggtca gatgctagta gcccatttcc aagcttgata gggtctctta 13740 

ccagcactcg caccacccgc ccgcgcctgc tgggcgagga ggagtaccta aacaactcgc 13800 

tgctgcagcc gcagcgcgaa aaaaacctgc ctccggcatt tcccaacaac gggatagaga 13860 

gcctagtgga caagatgagt agatggaaga cgtacgcgca ggagcacagg gacgtgccag 13920 

gcccgcgccc gcccacccgt cgtcaaaggc acgaccgtca gcggggtctg gtgtgggagg 13980 

acgatgactc ggcagacgac agcagcgtcc tggatttggg agggagtggc aacccgtttg 14040 

cgcaccttcg ccccaggctg gggagaatgt tttaaaaaaa aaaaagcatg atgcaaaata 14100 

aaaaactcac caaggccatg gcaccgagcg ttggttttct tgtattcccc ttagtatgcg 14160 

gcgcgcggcg atgtatgagg aaggtcctcc tccctcctac gagagtgtgg tgagcgcggc 14220 

gccagtggcg gcggcgctgg gttctccctt cgatgctccc ctggacccgc cgtttgtgcc 14280 

tccgcggtac ctgcggccta ccggggggag aaacagcatc cgttactctg agttggcacc 14340 

cctattcgac accacccgtg tgtacctggt ggacaacaag tcaacggatg tggcatccct 14400 

gaactaccag aacgaccaca gcaactttct gaccacggtc attcaaaaca atgactacag 14460 

cccgggggag gcaagcacac agaccatcaa tcttgacgac cggtcgcact ggggcggcga 14520 

cctgaaaacc atcctgcata ccaacatgcc aaatgtgaac gagttcatgt ttaccaataa 14580 

gtttaaggcg cgggtgatgg tgtcgcgctt gcctactaag gacaatcagg tggagctgaa 14640 

atacgagtgg gtggagttca cgctgcccga gggcaactac tccgagacca tgaccataga 14700 

ccttatgaac aacgcgatcg tggagcacta cttgaaagtg ggcagacaga acggggttct 14760 

ggaaagcgac atcggggtaa agtttgacac ccgcaacttc agactggggt ttgaccccgt 14820 

cactggtctt gtcatgcctg gggtatatac aaacgaagcc ttccatccag acatcatttt 14880 

gctgccagga tgcggggtgg acttcaccca cagccgcctg agcaacttgt tgggcatccg 14940 

caagcggcaa cccttccagg agggctttag gatcacctac gatgatctgg agggtggtaa 15000 

cattcccgca ctgttggatg tggacgccta ccaggcgagc ttgaaagatg acaccgaaca 15060 

gggcgggggt ggcgcaggcg gcagcaacag cagtggcagc ggcgcggaag agaactccaa 15120 

cgcggcagcc gcggcaatgc agccggtgga ggacatgaac gatcatgcca ttcgcggcga 15180 

cacctttgcc acacgggctg aggagaagcg cgctgaggcc gaagcagcgg ccgaagctgc 15240 

cgcccccgct gcgcaacccg aggtcgagaa gcctcagaag aaaccggtga tcaaacccct 15300 

gacagaggac agcaagaaac gcagttacaa cctaataagc aatgacagca ccttcaccca 15360 

gtaccgcagc tggtaccttg catacaacta cggcgaccct cagaccggaa tccgctcatg 15420 

gaccctgctt tgcactcctg acgtaacctg cggctcggag caggtctact ggtcgttgcc 15480 

agacatgatg caagaccccg tgaccttccg ctccacgcgc cagatcagca actttccggt 15540 

ggtgggcgcc gagctgttgc ccgtgcactc caagagcttc tacaacgacc aggccgtcta 15600 

ctcccaactc atccgccagt ttacctctct gacccacgtg ttcaatcgct ttcccgagaa 15660 

ccagattttg gcgcgcccgc cagcccccac catcaccacc gtcagtgaaa acgttcctgc 15720 

tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc agcgagtgac 15780 

cattactgac gccagacgcc gcacctgccc ctacgtttac aaggccctgg gcatagtctc 15840 

gccgcgcgtc ctatcgagcc gcactttttg agcaagcatg tccatcctta tatcgcccag 15900 

caataacaca ggctggggcc tgcgcttccc aagcaagatg tttggcgggg ccaagaagcg 15960 

ctccgaccaa cacccagtgc gcgtgcgcgg gcactaccgc gcgccctggg gcgcgcacaa 16020 

acgcggccgc actgggcgca ccaccgtcga tgacgccatc gacgcggtgg tggaggaggc 16080 

gcgcaactac acgcccacgc cgccaccagt gtccacagtg gacgcggcca ttcagaccgt 16140 

ggtgcgcgga gcccggcgct atgctaaaat gaagagacgg cggaggcgcg tagcacgtcg 16200 

ccaccgccgc cgacccggca ctgccgccca acgcgcggcg gcggccctgc ttaaccgcgc 16260 

acgtcgcacc ggccgacggg cggccatgcg ggccgctcga aggctggccg cgggtattgt 16320 

cactgtgccc cccaggtcca ggcgacgagc ggccgccgca gcagccgcgg ccattagtgc 16380 

tatgactcag ggtcgcaggg gcaacgtgta ttgggtgcgc gactcggtta gcggcctgcg 16440 

cgtgcccgtg cgcacccgcc ccccgcgcaa ctagattgca agaaaaaact acttagactc 16500 

gtactgttgt atgtatccag cggcggcggc gcgcaacgaa gctatgtcca agcgcaaaat 16560 

caaagaagag atgctccagg tcatcgcgcc ggagatctat ggccccccga agaaggaaga 16620 

gcaggattac aagccccgaa agctaaagcg ggtcaaaaag aaaaagaaag atgatgatga 16680 

tgaacttgac gacgaggtgg aactgctgca cgctaccgcg cccaggcgac gggtacagtg 16740 

gaaaggtcga cgcgtaaaac gtgttttgcg acccggcacc accgtagtct ttacgcccgg 16800 

tgagcgctcc acccgcacct acaagcgcgt gtatgatgag gtgtacggcg acgaggacct 16860 

gcttgagcag gccaacgagc gcctcgggga gtttgcctac ggaaagcggc ataaggacat 16920 

gctggcgttg ccgctggacg agggcaaccc aacacctagc ctaaagcccg taacactgca 16980 

gcaggtgctg cccgcgcttg caccgtccga agaaaagcgc ggcctaaagc gcgagtctgg 17040 

tgacttggca cccaccgtgc agctgatggt acccaagcgc cagcgactgg aagatgtctt 17100 

ggaaaaaatg accgtggaac ctgggctgga gcccgaggtc cgcgtgcggc caatcaagca 17160 

ggtggcgccg ggactgggcg tgcagaccgt ggacgttcag atacccacta ccagtagcac 17220 

cagtattgcc accgccacag agggcatgga gacacaaacg tccccggttg cctcagcggt 17280 

ggcggatgcc gcggtgcagg cggtcgctgc ggccgcgtcc aagacctcta cggaggtgca 17340 

aacggacccg tggatgtttc gcgtttcagc cccccggcgc ccgcgcggtt cgaggaagta 17400 

cggcgccgcc agcgcgctac tgcccgaata tgccctacat ccttccattg cgcctacccc 17460 

cggctatcgt ggctacacct accgccccag aagacgagca actacccgac gccgaaccac 17520 

cactggaacc cgccgccgcc gtcgccgtcg ccagcccgtg ctggccccga tttccgtgcg 17580 

cagggtggct cgcgaaggag gcaggaccct ggtgctgcca acagcgcgct accaccccag 17640 

catcgtttaa aagccggtct ttgtggttct tgcagatatg gccctcacct gccgcctccg 17700 

tttcccggtg ccgggattcc gaggaagaat gcaccgtagg aggggcatgg ccggccacgg 17760 

cctgacgggc ggcatgcgtc gtgcgcacca ccggcggcgg cgcgcgtcgc accgtcgcat 17820 

gcgcggcggt atcctgcccc tccttattcc actgatcgcc gcggcgattg gcgccgtgcc 17880 

cggaattgca tccgtggcct tgcaggcgca gagacactga ttaaaaacaa gttgcatgtg 17940 

gaaaaatcaa aataaaaagt ctggactctc acgctcgctt ggtcctgtaa ctattttgta 18000 

gaatggaaga catcaacttt gcgtctctgg ccccgcgaca cggctcgcgc ccgttcatgg 18060 

gaaactggca agatatcggc accagcaata tgagcggtgg cgccttcagc tggggctcgc 18120 

tgtggagcgg cattaaaaat ttcggttcca ccgttaagaa ctatggcagc aaggcctgga 18180 

acagcagcac aggccagatg ctgagggata agttgaaaga gcaaaatttc caacaaaagg 18240 

tggtagatgg cctggcctct ggcattagcg gggtggtgga cctggccaac caggcagtgc 18300 

aaaataagat taacagtaag cttgatcccc gccctcccgt agaggagcct ccaccggccg 18360 

tggagacagt gtctccagag gggcgtggcg aaaagcgtcc gcgccccgac agggaagaaa 18420 

ctctggtgac gcaaatagac gagcctccct cgtacgagga ggcactaaag caaggcctgc 18480 

ccaccacccg tcccatcgcg cccatggcta ccggagtgct gggccagcac acacccgtaa 18540 

cgctggacct gcctcccccc gccgacaccc agcagaaacc tgtgctgcca ggcccgaccg 18600 

ccgttgttgt aacccgtcct agccgcgcgt ccctgcgccg cgccgccagc ggtccgcgat 18660 

cgttgcggcc cgtagccagt ggcaactggc aaagcacact gaacagcatc gtgggtctgg 18720 

gggtgcaatc cctgaagcgc cgacgatgct tctgaatagc taacgtgtcg tatgtgtgtc 18780 

atgtatgcgt ccatgtcgcc gccagaggag ctgctgagcc gccgcgcgcc cgctttccaa 18840 

gatggctacc ccttcgatga tgccgcagtg gtcttacatg cacatctcgg gccaggacgc 18900 

ctcggagtac ctgagccccg ggctggtgca gtttgcccgc gccaccgaga cgtacctcag 18960 

cctgaataac aagtttagaa accccacggt ggcgcctacg cacgacgtga ccacagaccg 19020 

gtcccagcgt ttgacgctgc ggttcatccc tgtggaccgt gaggatactg cgtactcgta 19080 

caaggcgcgg ttcaccctag ctgtgggtga taaccgtgtg ctggacatgg cttccacgta 19140 

ctttgacatc cgcggcgtgc tggacagggg ccctactttt aagccctact ctggcactgc 19200 

ctacaacgcc ctggctccca agggtgcccc aaatccttgc gaatgggatg aagctgctac 19260 

tgctcttgaa ataaacctag aagaagagga cgatgacaac gaagacgaag tagacgagca 19320 

agctgagcag caaaaaactc acgtatttgg gcaggcgcct tattctggta taaatattac 19380 

aaaggagggt attcaaatag gtgtcgaagg tcaaacacct aaatatgccg ataaaacatt 19440 

tcaacctgaa cctcaaatag gagaatctca gtggtacgaa actgaaatta atcatgcagc 19500 

tgggagagtc cttaaaaaga ctaccccaat gaaaccatgt tacggttcat atgcaaaacc 19560 

cacaaatgaa aatggagggc aaggcattct tgtaaagcaa caaaatggaa agctagaaag 19620 

tcaagtggaa atgcaatttt tctcaactac tgaggcgacc gcaggcaatg gtgataactt 19680 

gactcctaaa gtggtattgt acagtgaaga tgtagatata gaaaccccag acactcatat 19740 

ttcttacatg cccactatta aggaaggtaa ctcacgagaa ctaatgggcc aacaatctat 19800 

gcccaacagg cctaattaca ttgcttttag ggacaatttt attggtctaa tgtattacaa 19860 

cagcacgggt aatatgggtg ttctggcggg ccaagcatcg cagttgaatg ctgttgtaga 19920 

tttgcaagac agaaacacag agctttcata ccagcttttg cttgattcca ttggtgatag 19980 

aaccaggtac ttttctatgt ggaatcaggc tgttgacagc tatgatccag atgttagaat 20040 

tattgaaaat catggaactg aagatgaact tccaaattac tgctttccac tgggaggtgt 20100 

gattaataca gagactctta ccaaggtaaa acctaaaaca ggtcaggaaa atggatggga 20160 

aaaagatgct acagaatttt cagataaaaa tgaaataaga gttggaaata attttgccat 20220 

ggaaatcaat ctaaatgcca acctgtggag aaatttcctg tactccaaca tagcgctgta 20280 

tttgcccgac aagctaaagt acagtccttc caacgtaaaa atttctgata acccaaacac 20340 

ctacgactac atgaacaagc gagtggtggc tcccgggtta gtggactgct acattaacct 20400 

tggagcacgc tggtcccttg actatatgga caacgtcaac ccatttaacc accaccgcaa 20460 

tgctggcctg cgctaccgct caatgttgct gggcaatggt cgctatgtgc ccttccacat 20520 

ccaggtgcct cagaagttct ttgccattaa aaacctcctt ctcctgccgg gctcatacac 20580 

ctacgagtgg aacttcagga aggatgttaa catggttctg cagagctccc taggaaatga 20640 

cctaagggtt gacggagcca gcattaagtt tgatagcatt tgcctttacg ccaccttctt 20700 

ccccatggcc cacaacaccg cctccacgct tgaggccatg cttagaaacg acaccaacga 20760 

ccagtccttt aacgactatc tctccgccgc caacatgctc taccctatac ccgccaacgc 20820 

taccaacgtg cccatatcca tcccctcccg caactgggcg gctttccgcg gctgggcctt 20880 

cacgcgcctt aagactaagg aaaccccatc actgggctcg ggctacgacc cttattacac 20940 

ctactctggc tctataccct acctagatgg aaccttttac ctcaaccaca cctttaagaa 21000 

ggtggccatt acctttgact cttctgtcag ctggcctggc aatgaccgcc tgcttacccc 21060 

caacgagttt gaaattaagc gctcagttga cggggagggt tacaacgttg cccagtgtaa 21120 

catgaccaaa gactggttcc tggtacaaat gctagctaac tacaacattg gctaccaggg 21180 

cttctatatc ccagagagct acaaggaccg catgtactcc ttctttagaa acttccagcc 21240 

catgagccgt caggtggtgg atgatactaa atacaaggac taccaacagg tgggcatcct 21300 

acaccaacac aacaactctg gatttgttgg ctaccttgcc cccaccatgc gcgaaggaca 21360 

ggcctaccct gctaacttcc cctatccgct tataggcaag accgcagttg acagcattac 21420 

ccagaaaaag tttctttgcg atcgcaccct ttggcgcatc ccattctcca gtaactttat 21480 

gtccatgggc gcactcacag acctgggcca aaaccttctc tacgccaact ccgcccacgc 21540 






gctagacatg acttttgagg tggatcccat ggacgagccc acccttcttt atgttttgtt 21600 

tgaagtcttt gacgtggtcc gtgtgcaccg gccgcaccgc ggcgtcatcg aaaccgtgta 21660 

cctgcgcacg cccttctcgg ccggcaacgc cacaacataa agaagcaagc aacatcaaca 21720 

acagctgccg ccatgggctc cagtgagcag gaactgaaag ccattgtcaa agatcttggt 21780 

tgtgggccat attttttggg cacctatgac aagcgctttc caggctttgt ttctccacac 21840 

aagctcgcct gcgccatagt caatacggcc ggtcgcgaga ctgggggcgt acactggatg 21900 

gcctttgcct ggaacccgca ctcaaaaaca tgctacctct ttgagccctt tggcttttct 21960 

gaccagcgac tcaagcaggt ttaccagttt gagtacgagt cactcctgcg ccgtagcgcc 22020 

attgcttctt cccccgaccg ctgtataacg ctggaaaagt ccacccaaag cgtacagggg 22080 

cccaactcgg ccgcctgtgg actattctgc tgcatgtttc tccacgcctt tgccaactgg 22140 

ccccaaactc ccatggatca caaccccacc atgaacctta ttaccggggt acccaactcc 22200 

atgctcaaca gtccccaggt acagcccacc ctgcgtcgca accaggaaca gctctacagc 22260 

ttcctggagc gccactcgcc ctacttccgc agccacagtg cgcagattag gagcgccact 22320 

tctttttgtc acttgaaaaa catgtaaaaa taatgtacta gagacacttt caataaaggc 22380 

aaatgctttt atttgtacac tctcgggtga ttatttaccc ccacccttgc cgtctgcgcc 22440 

gtttaaaaat caaaggggtt ctgccgcgca tcgctatgcg ccactggcag ggacacgttg 22500 

cgatactggt gtttagtgct ccacttaaac tcaggcacaa ccatccgcgg cagctcggtg 22560 

aagttttcac tccacaggct gcgcaccatc accaacgcgt ttagcaggtc gggcgccgat 22620 

atcttgaagt cgcagttggg gcctccgccc tgcgcgcgcg agttgcgata cacagggttg 22680 

cagcactgga acactatcag cgccgggtgg tgcacgctgg ccagcacgct cttgtcggag 22740 

atcagatccg cgtccaggtc ctccgcgttg ctcagggcga acggagtcaa ctttggtagc 22800 

tgccttccca aaaagggcgc gtgcccaggc tttgagttgc actcgcaccg tagtggcatc 22860 

aaaaggtgac cgtgcccggt ctgggcgtta ggatacagcg cctgcataaa agccttgatc 22920 

tgcttaaaag ccacctgagc ctttgcgcct tcagagaaga acatgccgca agacttgccg 22980 

gaaaactgat tggccggaca ggccgcgtcg tgcacgcagc accttgcgtc ggtgttggag 23040 

atctgcacca catttcggcc ccaccggttc ttcacgatct tggccttgct agactgctcc 23100 

ttcagcgcgc gctgcccgtt ttcgctcgtc acatccattt caatcacgtg ctccttattt 23160 

atcataatgc ttccgtgtag acacttaagc tcgccttcga tctcagcgca gcggtgcagc 23220 

cacaacgcgc agcccgtggg ctcgtgatgc ttgtaggtca cctctgcaaa cgactgcagg 23280 

tacgcctgca ggaatcgccc catcatcgtc acaaaggtct tgttgctggt gaaggtcagc 23340 

tgcaacccgc ggtgctcctc gttcagccag gtcttgcata cggccgccag agcttccact 23400 

tggtcaggca gtagtttgaa gttcgccttt agatcgttat ccacgtggta cttgtccatc 23460 

agcgcgcgcg cagcctccat gcccttctcc cacgcagaca cgatcggcac actcagcggg 23520 

ttcatcaccg taatttcact ttccgcttcg ctgggctctt cctcttcctc ttgcgtccgc 23580 

ataccacgcg ccactgggtc gtcttcattc agccgccgca ctgtgcgctt acctcctttg 23640 

ccatgcttga ttagcaccgg tgggttgctg aaacccacca tttgtagcgc cacatcttct 23700 

ctttcttcct cgctgtccac gattacctct ggtgatggcg ggcgctcggg cttgggagaa 23760 

gggcgcttct ttttcttctt gggcgcaatg gccaaatccg ccgccgaggt cgatggccgc 23820 

gggctgggtg tgcgcggcac cagcgcgtct tgtgatgagt cttcctcgtc ctcggactcg 23880 

atacgccgcc tcatccgctt ttttgggggc gcccggggag gcggcggcga cggggacggg 23940 

gacgacacgt cctccatggt tgggggacgt cgcgccgcac cgcgtccgcg ctcgggggtg 24000 

gtttcgcgct gctcctcttc ccgactggcc atttccttct cctataggca gaaaaagatc 24060 

atggagtcag tcgagaagaa ggacagccta accgccccct ctgagttcgc caccaccgcc 24120 

tccaccgatg ccgccaacgc gcctaccacc ttccccgtcg aggcaccccc gcttgaggag 24180 

gaggaagtga ttatcgagca ggacccaggt tttgtaagcg aagacgacga ggaccgctca 24240 

gtaccaacag aggataaaaa gcaagaccag gacaacgcag aggcaaacga ggaacaagtc 24300 

gggcgggggg acgaaaggca tggcgactac ctagatgtgg gagacgacgt gctgttgaag 24360 

catctgcagc gccagtgcgc cattatctgc gacgcgttgc aagagcgcag cgatgtgccc 24420 

ctcgccatag cggatgtcag ccttgcctac gaacgccacc tattctcacc gcgcgtaccc 24480 

cccaaacgcc aagaaaacgg cacatgcgag cccaacccgc gcctcaactt ctaccccgta 24540 

tttgccgtgc cagaggtgct tgccacctat cacatctttt tccaaaactg caagataccc 24600 

ctatcctgcc gtgccaaccg cagccgagcg gacaagcagc tggccttgcg gcagggcgct 24660 

gtcatacctg atatcgcctc gctcaacgaa gtgccaaaaa tctttgaggg tcttggacgc 24720 

gacgagaagc gcgcggcaaa cgctctgcaa caggaaaaca gcgaaaatga aagtcactct 24780 

ggagtgttgg tggaactcga gggtgacaac gcgcgcctag ccgtactaaa acgcagcatc 24840 

gaggtcaccc actttgccta cccggcactt aacctacccc ccaaggtcat gagcacagtc 24900 

atgagtgagc tgatcgtgcg ccgtgcgcag cccctggaga gggatgcaaa tttgcaagaa 24960 

caaacagagg agggcctacc cgcagttggc gacgagcagc tagcgcgctg gcttcaaacg 25020 

cgcgagcctg ccgacttgga ggagcgacgc aaactaatga tggccgcagt gctcgttacc 25080 

gtggagcttg agtgcatgca gcggttcttt gctgacccgg agatgcagcg caagctagag 25140 

gaaacattgc actacacctt tcgacagggc tacgtacgcc aggcctgcaa gatctccaac 25200 

gtggagctct gcaacctggt ctcctacctt ggaattttgc acgaaaaccg ccttgggcaa 25260 

aacgtgcttc attccacgct caagggcgag gcgcgccgcg actacgtccg cgactgcgtt 25320 

tacttatttc tatgctacac ctggcagacg gccatgggcg tttggcagca gtgcttggag 25380 

gagtgcaacc tcaaggagct gcagaaactg ctaaagcaaa acttgaagga cctatggacg 25440 

gccttcaacg agcgctccgt ggccgcgcac ctggcggaca tcattttccc cgaacgcctg 25500 

cttaaaaccc tgcaacaggg tctgccagac ttcaccagtc aaagcatgtt gcagaacttt 25560 

aggaacttta tcctagagcg ctcaggaatc ttgcccgcca cctgctgtgc acttcctagc 25620 

gactttgtgc ccattaagta ccgcgaatgc cctccgccgc tttggggcca ctgctacctt 25680 

ctgcagctag ccaactacct tgcctaccac tctgacataa tggaagacgt gagcggtgac 25740 

ggtctactgg agtgtcactg tcgctgcaac ctatgcaccc cgcaccgctc cctggtttgc 25800 

aattcgcagc tgcttaacga aagtcaaatt atcggtacct ttgagctgca gggtccctcg 25860 

cctgacgaaa agtccgcggc tccggggttg aaactcactc cggggctgtg gacgtcggct 25920 

taccttcgca aatttgtacc tgaggactac cacgcccacg agattaggtt ctacgaagac 25980 

caatcccgcc cgccaaatgc ggagcttacc gcctgcgtca ttacccaggg ccacattctt 26040 

ggccaattgc aagccatcaa caaagcccgc caagagtttc tgctacgaaa gggacggggg 26100 

gtttacttgg acccccagtc cggcgaggag ctcaacccaa tccccccgcc gccgcagccc 26160 

tatcagcagc agccgcgggc ccttgcttcc caggatggca cccaaaaaga agctgcagct 26220 

gccgccgcca cccacggacg aggaggaata ctgggacagt caggcagagg aggttttgga 26280 

cgaggaggag gaggacatga tggaagactg ggagagccta gacgaggaag cttccgaggt 26340 

cgaagaggtg tcagacgaaa caccgtcacc ctcggtcgca ttcccctcgc cggcgcccca 26400 

gaaatcggca accggttcca gcatggctac aacctccgct cctcaggcgc cgccggcact 26460 

gcccgttcgc cgacccaacc gtagatggga caccactgga accagggccg gtaagtccaa 26520 

gcagccgccg ccgttagccc aagagcaaca acagcgccaa ggctaccgct catggcgcgg 26580 

gcacaagaac gccatagttg cttgcttgca agactgtggg ggcaacatct ccttcgcccg 26640 

ccgctttctt ctctaccatc acggcgtggc cttcccccgt aacatcctgc attactaccg 26700 

tcatctctac agcccatact gcaccggcgg cagcggcagc ggcagcaaca gcagcggcca 26760 

cacagaagca aaggcgaccg gatagcaaga ctctgacaaa gcccaagaaa tccacagcgg 26820 

cggcagcagc aggaggagga gcgctgcgtc tggcgcccaa cgaacccgta tcgacccgcg 26880 

agcttagaaa caggattttt cccactctgt atgctatatt tcaacagagc aggggccaag 26940 

aacaagagct gaaaataaaa aacaggtctc tgcgatccct cacccgcagc tgcctgtatc 27000 

acaaaagcga agatcagctt cggcgcacgc tggaagacgc ggaggctctc ttcagtaaat 27060 

actgcgcgct gactcttaag gactagtttc gcgccctttc tcaaatttaa gcgcgaaaac 27120 

tacgtcatct ccagcggcca cacccggcgc cagcacctgt cgtcagcgcc attatgagca 27180 

aggaaattcc cacgccctac atgtggagtt accagccaca aatgggactt gcggctggag 27240 

ctgcccaaga ctactcaacc cgaataaact acatgagcgc gggaccccac atgatatccc 27300 

gggtcaacgg aatccgcgcc caccgaaacc gaattctctt ggaacaggcg gctattacca 27360 

ccacacctcg taataacctt aatccccgta gttggcccgc tgccctggtg taccaggaaa 27420 

gtcccgctcc caccactgtg gtacttccca gagacgccca ggccgaagtt cagatgacta 27480 

actcaggggc gcagcttgcg ggcggctttc gtcacagggt gcggtcgccc gggcagggta 27540 

taactcacct gacaatcaga gggcgaggta ttcagctcaa cgacgagtcg gtgagctcct 27600 

cgcttggtct ccgtccggac gggacatttc agatcggcgg cgccggccgt ccttcattca 27660 

cgcctcgtca ggcaatccta actctgcaga cctcgtcctc tgagccgcgc tctggaggca 27720 

ttggaactct gcaatttatt gaggagtttg tgccatcggt ctactttaac cccttctcgg 27780 

gacctcccgg ccactatccg gatcaattta ttcctaactt tgacgcggta aaggactcgg 27840 

cggacggcta cgactgaatg ttaagtggag aggcagagca actgcgcctg aaacacctgg 27900 

tccactgtcg ccgccacaag tgctttgccc gcgactccgg tgagttttgc tactttgaat 27960 

tgcccgagga tcatatcgag ggcccggcgc acggcgtccg gcttaccgcc cagggagagc 28020 

ttgcccgtag cctgattcgg gagtttaccc agcgccccct gctagttgag cgggacaggg 28080 

gaccctgtgt tctcactgtg atttgcaact gtcctaacct tggattacat caagatcttt 28140 

gttgccatct ctgtgctgag tataataaat acagaaatta aaatatactg gggctcctat 28200 

cgccatcctg taaacgccac cgtcttcacc cgcccaagca aaccaaggcg aaccttacct 28260 

ggtactttta acatctctcc ctctgtgatt tacaacagtt tcaacccaga cggagtgagt 28320 

ctacgagaga acctctccga gctcagctac tccatcagaa aaaacaccac cctccttacc 28380 

tgccgggaac gtacgagtgc gtcaccggcc gctgcaccac acctaccgcc tgaccgtaaa 28440 

ccagactttt tccggacaga cctcaataac tctgtttacc agaacaggag gtgagcttag 28500 

aaaaccctta gggtattagg ccaaaggcgc agctactgtg gggtttatga acaattcaag 28560 

caactctacg ggctattcta attcaggttt ctctagaatc ggggttgggg ttattctctg 28620 

tcttgtgatt ctctttattc ttatactaac gcttctctgc ctaaggctcg ccgcctgctg 28680 

tgtgcacatt tgcatttatt gtcagctttt taaacgctgg ggtcgccacc caagatgatt 28740 

aggtacataa tcctaggttt actcaccctt gcgtcagccc acggtaccac ccaaaaggtg 28800 

gattttaagg agccagcctg taatgttaca ttcgcagctg aagctaatga gtgcaccact 28860 

cttataaaat gcaccacaga acatgaaaag ctgcttattc gccacaaaaa caaaattggc 28920 

aagtatgctg tttatgctat ttggcagcca ggtgacacta cagagtataa tgttacagtt 28980 

ttccagggta aaagtcataa aacttttatg tatacttttc cattttatga aatgtgcgac 29040 

attaccatgt acatgagcaa acagtataag ttgtggcccc cacaaaattg tgtggaaaac 29100 

actggcactt tctgctgcac tgctatgcta attacagtgc tcgctttggt ctgtacccta 29160 

ctctatatta aatacaaaag cagacgcagc tttattgagg aaaagaaaat gccttaattt 29220 

actaagttac aaagctaatg tcaccactaa ctgctttact cgctgcttgc aaaacaaatt 29280 

caaaaagtta gcattataat tagaatagga tttaaacccc ccggtcattt cctgctcaat 29340 

accattcccc tgaacaattg actctatgtg ggatatgctc cagcgctaca accttgaagt 29400 

caggcttcct ggatgtcagc atctgacttt ggccagcacc tgtcccgcgg atttgttcca 29460 

gtccaactac agcgacccac cctaacagag atgaccaaca caaccaacgc ggccgccgct 29520 

accggactta catctaccac aaatacaccc caagtttctg cctttgtcaa taactgggat 29580 

aacttgggca tgtggtggtt ctccatagcg cttatgtttg tatgccttat tattatgtgg 29640 

ctcatctgct gcctaaagcg caaacgcgcc cgaccaccca tctatagtcc catcattgtg 29700 

ctacacccaa acaatgatgg aatccataga ttggacggac tgaaacacat gttcttttct 29760 

cttacagtat gattaaatga gacatgattc ctcgagtttt tatattactg acccttgttg 29820 

cgcttttttg tgcgtgctcc acattggctg cggtttctca catcgaagta gactgcattc 29880 

cagccttcac agtctatttg ctttacggat ttgtcaccct cacgctcatc tgcagcctca 29940 

tcactgtggt catcgccttt atccagtgca ttgactgggt ctgtgtgcgc tttgcatatc 30000 

tcagacacca tccccagtac agggacagga ctatagctga gcttcttaga attctttaat 30060 

tatgaaattt actgtgactt ttctgctgat tatttgcacc ctatctgcgt tttgttcccc 30120 

gacctccaag cctcaaagac atatatcatg cagattcact cgtatatgga atattccaag 30180 

ttgctacaat gaaaaaagcg atctttccga agcctggtta tatgcaatca tctctgttat 30240 

ggtgttctgc agtaccatct tagccctagc tatatatccc taccttgaca ttggctggaa 30300 

acgaatagat gccatgaacc acccaacttt ccccgcgccc gctatgcttc cactgcaaca 30360 

agttgttgcc ggcggctttg tcccagccaa tcagcctcgc cccacttctc ccacccccac 30420 

tgaaatcagc tactttaatc taacaggagg agatgactga caccctagat ctagaaatgg 30480 

acggaattat tacagagcag cgcctgctag aaagacgcag ggcagcggcc gagcaacagc 30540 

gcatgaatca agagctccaa gacatggtta acttgcacca gtgcaaaagg ggtatctttt 30600 

gtctggtaaa gcaggccaaa gtcacctacg acagtaatac caccggacac cgccttagct 30660 

acaagttgcc aaccaagcgt cagaaattgg tggtcatggt gggagaaaag cccattacca 30720 

taactcagca ctcggtagaa accgaaggct gcattcactc accttgtcaa ggacctgagg 30780 

atctctgcac ccttattaag accctgtgcg gtctcaaaga tcttattccc tttaactaat 30840 

aaaaaaaaat aataaagcat cacttactta aaatcagtta gcaaatttct gtccagttta 30900 

ttcagcagca cctccttgcc ctcctcccag ctctggtatt gcagcttcct cctggctgca 30960 

aactttctcc acaatctaaa tggaatgtca gtttcctcct gttcctgtcc atccgcaccc 31020 

actatcttca tgttgttgca gatgaagcgc gcaagaccgt ctgaagatac cttcaacccc 31080 

gtgtatccat atgacacgga aaccggtcct ccaactgtgc cttttcttac tcctcccttt 31140 

gtatccccca atgggtttca agagagtccc cctggggtac tctctttgcg cctatccgaa 31200 

cctctagtta cctccaatgg catgcttgcg ctcaaaatgg gcaacggcct ctctctggac 31260 

gaggccggca accttacctc ccaaaatgta accactgtga gcccacctct caaaaaaacc 31320 

aagtcaaaca taaacctgga aatatctgca cccctcacag ttacctcaga agccctaact 31380 

gtggctgccg ccgcacctct aatggtcgcg ggcaacacac tcaccatgca atcacaggcc 31440 

ccgctaaccg tgcacgactc caaacttagc attgccaccc aaggacccct cacagtgtca 31500 

gaaggaaagc tagccctgca aacatcaggc cccctcacca ccaccgatag cagtaccctt 31560 

actatcactg cctcaccccc tctaactact gccactggta gcttgggcat tgacttgaaa 31620 

gagcccattt atacacaaaa tggaaaacta ggactaaagt acggggctcc tttgcatgta 31680 

acagacgacc taaacacttt gaccgtagca actggtccag gtgtgactat taataatact 31740 

tccttgcaaa ctaaagttac tggagccttg ggttttgatt cacaaggcaa tatgcaactt 31800 

aatgtagcag gaggactaag gattgattct caaaacagac gccttatact tgatgttagt 31860 

tatccgtttg atgctcaaaa ccaactaaat ctaagactag gacagggccc tctttttata 31920 

aactcagccc acaacttgga tattaactac aacaaaggcc tttacttgtt tacagcttca 31980 

aacaattcca aaaagcttga ggttaaccta agcactgcca aggggttgat gtttgacgct 32040 

acagccatag ccattaatgc aggagatggg cttgaatttg gttcacctaa tgcaccaaac 32100 

acaaatcccc tcaaaacaaa aattggccat ggcctagaat ttgattcaaa caaggctatg 32160 

gttcctaaac taggaactgg ccttagtttt gacagcacag gtgccattac agtaggaaac 32220 

aaaaataatg ataagctaac tttgtggacc acaccagctc catctcctaa ctgtagacta 32280 

aatgcagaga aagatgctaa actcactttg gtcttaacaa aatgtggcag tcaaatactt 32340 

gctacagttt cagttttggc tgttaaaggc agtttggctc caatatctgg aacagttcaa 32400 

agtgctcatc ttattataag atttgacgaa aatggagtgc tactaaacaa ttccttcctg 32460 

gacccagaat attggaactt tagaaatgga gatcttactg aaggcacagc ctatacaaac 32520 

gctgttggat ttatgcctaa cctatcagct tatccaaaat ctcacggtaa aactgccaaa 32580 

agtaacattg tcagtcaagt ttacttaaac ggagacaaaa ctaaacctgt aacactaacc 32640 

attacactaa acggtacaca ggaaacagga gacacaactc caagtgcata ctctatgtca 32700 

ttttcatggg actggtctgg ccacaactac attaatgaaa tatttgccac atcctcttac 32760 

actttttcat acattgccca agaataaaga atcgtttgtg ttatgtttca acgtgtttat 32820 

ttttcaattg cagaaaattt caagtcattt ttcattcagt agtatagccc caccaccaca 32880 

tagcttatac agatcaccgt accttaatca aactcacaga accctagtat tcaacctgcc 32940 

acctccctcc caacacacag agtacacagt cctttctccc cggctggcct taaaaagcat 33000 

catatcatgg gtaacagaca tattcttagg tgttatattc cacacggttt cctgtcgagc 33060 

caaacgctca tcagtgatat taataaactc cccgggcagc tcacttaagt tcatgtcgct 33120 

gtccagctgc tgagccacag gctgctgtcc aacttgcggt tgcttaacgg gcggcgaagg 33180 

agaagtccac gcctacatgg gggtagagtc ataatcgtgc atcaggatag ggcggtggtg 33240 

ctgcagcagc gcgcgaataa actgctgccg ccgccgctcc gtcctgcagg aatacaacat 33300 

ggcagtggtc tcctcagcga tgattcgcac cgcccgcagc ataaggcgcc ttgtcctccg 33360 

ggcacagcag cgcaccctga tctcacttaa atcagcacag taactgcagc acagcaccac 33420 

aatattgttc aaaatcccac agtgcaaggc gctgtatcca aagctcatgg cggggaccac 33480 

agaacccacg tggccatcat accacaagcg caggtagatt aagtggcgac ccctcataaa 33540 

cacgctggac ataaacatta cctcttttgg catgttgtaa ttcaccacct cccggtacca 33600 

tataaacctc tgattaaaca tggcgccatc caccaccatc ctaaaccagc tggccaaaac 33660 

ctgcccgccg gctatacact gcagggaacc gggactggaa caatgacagt ggagagccca 33720 

ggactcgtaa ccatggatca tcatgctcgt catgatatca atgttggcac aacacaggca 33780 

cacgtgcata cacttcctca ggattacaag ctcctcccgc gttagaacca tatcccaggg 33840 

aacaacccat tcctgaatca gcgtaaatcc cacactgcag ggaagacctc gcacgtaact 33900 

cacgttgtgc attgtcaaag tgttacattc gggcagcagc ggatgatcct ccagtatggt 33960 

agcgcgggtt tctgtctcaa aaggaggtag acgatcccta ctgtacggag tgcgccgaga 34020 

caaccgagat cgtgttggtc gtagtgtcat gccaaatgga acgccggacg tagtcatatt 34080 

tcctgaagca aaaccaggtg cgggcgtgac aaacagatct gcgtctccgg tctcgccgct 34140 

tagatcgctc tgtgtagtag ttgtagtata tccactctct caaagcatcc aggcgccccc 34200 

tggcttcggg ttctatgtaa actccttcat gcgccgctgc cctgataaca tccaccaccg 34260 

cagaataagc cacacccagc caacctacac attcgttctg cgagtcacac acgggaggag 34320 

cgggaagagc tggaagaacc atgttttttt ttttattcca aaagattatc caaaacctca 34380 

aaatgaagat ctattaagtg aacgcgctcc cctccggtgg cgtggtcaaa ctctacagcc 34440 

aaagaacaga taatggcatt tgtaagatgt tgcacaatgg cttccaaaag gcaaacggcc 34500 

ctcacgtcca agtggacgta aaggctaaac ccttcagggt gaatctcctc tataaacatt 34560 

ccagcacctt caaccatgcc caaataattc tcatctcgcc accttctcaa tatatctcta 34620 

agcaaatccc gaatattaag tccggccatt gtaaaaatct gctccagagc gccctccacc 34680 

ttcagcctca agcagcgaat catgattgca aaaattcagg ttcctcacag acctgtataa 34740 

gattcaaaag cggaacatta acaaaaatac cgcgatcccg taggtccctt cgcagggcca 34800 

gctgaacata atcgtgcagg tctgcacgga ccagcgcggc cacttccccg ccaggaacct 34860 

tgacaaaaga acccacactg attatgacac gcatactcgg agctatgcta accagcgtag 34920 

ccccgatgta agctttgttg catgggcggc gatataaaat gcaaggtgct gctcaaaaaa 34980 

tcaggcaaag cctcgcgcaa aaaagaaagc acatcgtagt catgctcatg cagataaagg 35040 

caggtaagct ccggaaccac cacagaaaaa gacaccattt ttctctcaaa catgtctgcg 35100 

ggtttctgca taaacacaaa ataaaataac aaaaaaacat ttaaacatta gaagcctgtc 35160 

ttacaacagg aaaaacaacc cttataagca taagacggac tacggccatg ccggcgtgac 35220 

cgtaaaaaaa ctggtcaccg tgattaaaaa gcaccaccga cagctcctcg gtcatgtccg 35280 

gagtcataat gtaagactcg gtaaacacat caggttgatt catcggtcag tgctaaaaag 35340 

cgaccgaaat agcccggggg aatacatacc cgcaggcgta gagacaacat tacagccccc 35400 

ataggaggta taacaaaatt aataggagag aaaaacacat aaacacctga aaaaccctcc 35460 

tgcctaggca aaatagcacc ctcccgctcc agaacaacat acagcgcttc acagcggcag 35520 

cctaacagtc agccttacca gtaaaaaaga aaacctatta aaaaaacacc actcgacacg 35580 

gcaccagctc aatcagtcac agtgtaaaaa agggccaagt gcagagcgag tatatatagg 35640 

actaaaaaat gacgtaacgg ttaaagtcca caaaaaacac ccagaaaacc gcacgcgaac 35700 

ctacgcccag aaacgaaagc caaaaaaccc acaacttcct caaatcgtca cttccgtttt 35760 

cccacgttac gtaacttccc attttaagaa aactacaatt cccaacacat acaagttact 35820 

ccgccctaaa acctacgtca cccgccccgt tcccacgccc cgcgccacgt cacaaactcc 35880 

accccctcat tatcatattg gcttcaatcc aaaataaggt atattattga tgatg 35935 



<210> 10 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> NSsuboptmut 



<400> 10 

gccaccatgg cccccatcac cgcctacagc cagcagacca ggggcctgct gggctgcatc 60 

atcaccagcc tgaccggacg cgacaagaac caggtggagg gagaggtgca ggtggtgagc 120 

accgctaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggagccggaa gcaagaccct ggccggaccc aagggcccta tcacccagat gtacaccaat 240 

gtggatcagg atctggtggg ctggcaggcc cctcccggag ccaggagcct gacaccctgt 300 

acctgtggaa gcagcgacct gtacctggtg acacgccacg ccgatgtgat ccccgtgagg 360 

cgcaggggcg attctcgcgg aagcctgctg agccctaggc ccgtgagcta cctgaagggc 420 

agcagcggag gacccctgct gtgtccttct ggccatgccg tgggcatttt tcgcgctgcc 480 

gtgtgtacca ggggcgtggc caaagccgtg gattttgtgc ccgtggaaag catggagacc 540 

accatgcgca gccctgtgtt caccgacaac agctctcccc ctgccgtgcc ccaatcattc 600 

caggtggctc acctgcacgc ccctaccgga tctggcaaga gcaccaaggt gcccgctgcc 660 

tacgccgctc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc taccctgggc 720 

ttcggcgctt acatgagcaa ggcccatggc atcgacccca acatccgcac aggcgtgcgc 780 

accatcacca ccggagctcc cgtgacctac agcacctacg gcaagttcct ggccgatgga 840 

ggctgcagcg gaggagccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcattgg caccgtgctg gatcaggccg aaacagctgg agccaggctg 960 

gtggtgctgg ccacagctac ccctcctggc agcgtgaccg tgccccatcc caatatcgag 1020 

gaggtggccc tgagcaacac aggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gaggcaggca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gctgccaagc tgagcggact gggcatcaac gccgtggcct actacagggg cctggacgtg 1200 

tcagtgatcc ccaccatcgg cgatgtggtg gtggtggcca ccgacgccct gatgacaggc 1260 

tacaccggag acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgaa accaccaccg tgcctcagga tgctgtgagc 1380 

aggagccaga ggcgcggacg caccggaagg ggcaggcgcg gaatttatcg ctttgtgacc 1440 

cctggcgaaa ggccctctgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgct 1500 

ggctgcgctt ggtacgagct gacacccgct gaaaccagcg tgcgcctgcg cgcttatctg 1560 

aatacccctg gcctgcccgt gtgtcaggac cacctggagt tctgggagag cgtgttcaca 1620 

ggactgaccc acatcgacgc ccatttcctg agccagacca agcaggctgg cgacaacttc 1680 

ccctatctgg tggcctatca ggccaccgtg tgtgctaggg cccaagctcc acctccttca 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccctacccct 1800 

ctgctgtacc gcctgggagc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgctgatctg gaagtggtga ccagcacctg ggtgctggtg 1920 

ggaggcgtgc tggccgctct ggctgcctac tgcctgacca ccggaagcgt ggtgatcgtg 1980 

ggacgcatca tcctgagcgg aaggcccgct atcgtgcccg atcgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgtgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgaac agttcaagca gaaggccctg ggcctgctgc agacagccac caaacaggcc 2160 

gaagctgccg ctcccgtggt ggaaagcaag tggagggccc tggagacctt ctgggctaag 2220 

cacatgtgga acttcatctc tggcatccag tacctggccg gactgagcac cctgcctggc 2280 

aaccccgcta tcgccagcct gatggccttc accgctagca tcacctctcc cctgaccacc 2340 

cagagcaccc tgctgttcaa cattctgggc ggatgggtgg ccgctcagct ggcccctcct 2400 

tcagctgctt ctgcctttgt gggcgctggc attgccggag ccgctgtggg cagcattggc 2460 

ctgggcaaag tgctggtgga tattctggct ggctatggcg ctggcgtggc cggagccctg 2520 

gtggccttca aggtgatgag cggagagatg cccagcaccg aggacctggt gaacctgctg 2580 

cctgccattc tgagccctgg agccctggtg gtgggcgtgg tgtgtgctgc cattctgagg 2640 

cgccatgtgg gacccggaga gggcgctgtg cagtggatga accgcctgat cgccttcgcc 2700 

tctcgcggaa accacgtgag ccctacccac tacgtgcctg agagcgacgc cgctgccagg 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac accctgcagc ggaagctggc tgagggacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccaactg 2940 

cctggcgtgc ccttcttctc atgccagcgc ggatacaagg gcgtgtggag gggcgatggc 3000 

atcatgcaga ccacctgtcc ctgcggagcc cagatcacag gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccctaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggaccctg cacacccagc cctgctccca actacagcag ggccctgtgg 3180 

agggtggctg ccgaggagta cgtggaggtg accagggtgg gagacttcca ctacgtgacc 3240 

ggaatgacca ccgacaacgt gaagtgtccc tgtcaggtgc ccgctcccga attttttacc 3300 

gaagtggatg gcgtgcgcct gcatcgctat gcccctgcct gtaggcccct gctgcgcgaa 3360 

gaagtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cctgagcccg atgtggccgt gctgaccagc atgctgaccg accccagcca catcacagcc 3480 

gaaaccgcta aaaggcgcct ggccaggggc tctcctccaa gcctggcctc aagcagcgct 3540 

agccagctgt ctgctcccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780 

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080 

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200 

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgctc gcctgatcgt gttccccgat 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgcctcag 4740 

gtggtgatgg gctcaagcta cggcttccag tacagccctg gccagcgcgt ggagttcctg 4800 

gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac acgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccctg aggccaggca ggccatcaag agcctgaccg agcgcctgta catcggaggc 4980 

cctctgacca acagcaaggg acagaactgc ggatacaggc gctgtagggc ctctggcgtg 5040 

ctgaccacca gctgtggcaa caccctgacc tgctacctga aggccagcgc tgcctgtcgc 5100 

gctgccaagc tgcaggactg caccatgctg gtgaacgccg ctggcctggt ggtgatttgt 5160 

gaaagcgctg gcacccagga agatgctgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

aggtactctg cccctcccgg agacccccct cagcccgaat acgacctgga gctgatcacc 5280 

agctgctcaa gcaacgtgag cgtggctcac gacgccagcg gaaagcgcgt gtactacctg 5340 

acacgcgatc ccaccacccc tctggctcgc gctgcctggg aaaccgctcg ccatacaccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccta ccctgtgggc tcgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gctcaggagc agctggagaa ggccctggac 5520 

tgccagattt acggcgcttg ctacagcatc gagcccctgg acctgcccca aatcatcgag 5580 

cgcctgcacg gcctgtctgc cttcagcctg cacagctaca gccctggcga aattaatcgc 5640 

gtggccagct gtctgcgcaa actgggcgtg cctcctctgc gcgtgtggag gcatagggct 5700 

aggagcgtga gggctaggct gctgagccag ggaggcaggg ccgctacctg tggaaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ctatccctgc cgctagccag 5820 

ctggacctga gcggatggtt cgtggctggc tacagcggag gcgacatcta ccacagcctg 5880 

tctcgcgctc gccctcgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 



<210> 11 
<211> 5965 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Chimeric NSsuboptmut 



<400> 11 

gccaccatgg cccccatcac cgcctacagc cagcagaccc gcggcctgct gggctgcatc 60 

atcaccagcc tgaccggccg cgacaagaac caggtggagg gcgaggtgca ggtggtgagc 120 

accgccaccc agagcttcct ggccacctgc gtgaacggcg tgtgctggac cgtgtaccac 180 

ggcgccggca gcaagaccct ggccggcccc aagggcccca tcacccagat gtacaccaac 240 

gtggaccagg acctggtggg ctggcaggcc ccccccggcg cccgcagcct gaccccctgc 300 

acctgcggca gcagcgacct gtacctggtg acccgccacg ccgacgtgat ccccgtgcgc 360 

cgccgcggcg acagccgcgg cagcctgctg agcccccgcc ccgtgagcta cctgaagggc 420 

agcagcggcg gccccctgct gtgccccagc ggccacgccg tgggcatctt ccgcgccgcc 480 

gtgtgcaccc gcggcgtggc caaggccgtg gacttcgtgc ccgtggagag catggagacc 540 

accatgcgca gccccgtgtt caccgacaac agcagccccc ccgccgtgcc ccagagcttc 600 

caggtggccc acctgcacgc ccccaccggc agcggcaaga gcaccaaggt gcccgccgcc 660 

tacgccgccc agggctacaa ggtgctggtg ctgaacccca gcgtggccgc caccctgggc 720 

ttcggcgcct acatgagcaa ggcccacggc atcgacccca acatccgcac cggcgtgcgc 780 

accatcacca ccggcgcccc cgtgacctac agcacctacg gcaagttcct ggccgacggc 840 

ggctgcagcg gcggcgccta cgacatcatc atctgcgacg agtgccacag caccgacagc 900 

accaccatcc tgggcatcgg caccgtgctg gaccaggccg agaccgccgg cgcccgcctg 960 

gtggtgctgg ccaccgccac cccccccggc agcgtgaccg tgccccaccc caacatcgag 1020 

gaggtggccc tgagcaacac cggcgagatc cccttctacg gcaaggccat ccccatcgag 1080 

gccatccgcg gcggccgcca cctgatcttc tgccacagca agaagaagtg cgacgagctg 1140 

gccgccaagc tgagcggcct gggcatcaac gccgtggcct actaccgcgg cctggacgtg 1200 

agcgtgatcc ccaccatcgg cgacgtggtg gtggtggcca ccgacgccct gatgaccggc 1260 

tacaccggcg acttcgacag cgtgatcgac tgcaacacct gcgtgaccca gaccgtggac 1320 

ttcagcctgg accccacctt caccatcgag accaccaccg tgccccagga cgccgtgagc 1380 

cgcagccagc gccgcggccg caccggccgc ggccgccgcg gcatctaccg cttcgtgacc 1440 

cccggcgagc gccccagcgg catgttcgac agcagcgtgc tgtgcgagtg ctacgacgcc 1500 

ggctgcgcct ggtacgagct gacccccgcc gagaccagcg tgcgcctgcg cgcctacctg 1560 

aacacccccg gcctgcccgt gtgccaggac cacctggagt tctgggagag cgtgttcacc 1620 

ggcctgaccc acatcgacgc ccacttcctg agccagacca agcaggccgg cgacaacttc 1680 

ccctacctgg tggcctacca ggccaccgtg tgcgcccgcg cccaggcccc cccccccagc 1740 

tgggaccaga tgtggaagtg cctgatccgc ctgaagccca ccctgcacgg ccccaccccc 1800 

ctgctgtacc gcctgggcgc cgtgcagaac gaggtgaccc tgacccaccc catcaccaag 1860 

tacatcatgg cctgcatgag cgccgacctg gaggtggtga ccagcacctg ggtgctggtg 1920 

ggcggcgtgc tggccgccct ggccgcctac tgcctgacca ccggcagcgt ggtgatcgtg 1980 

ggccgcatca tcctgagcgg ccgccccgcc atcgtgcccg accgcgagtt cctgtaccag 2040 

gagttcgacg agatggagga gtgcgccagc cacctgccct acatcgagca gggcatgcag 2100 

ctggccgagc agttcaagca gaaggccctg ggcctgctgc agaccgccac caagcaggcc 2160 

gaggccgccg cccccgtggt ggagagcaag tggcgcgccc tggagacctt ctgggccaag 2220 

cacatgtgga acttcatcag cggcatccag tacctggccg gcctgagcac cctgcccggc 2280 

aaccccgcca tcgccagcct gatggccttc accgccagca tcaccagccc cctgaccacc 2340 

cagagcaccc tgctgttcaa catcctgggc ggctgggtgg ccgcccagct ggcccccccc 2400 

agcgccgcca gcgccttcgt gggcgccggc atcgccggcg ccgccgtggg cagcatcggc 2460 

ctgggcaagg tgctggtgga catcctggcc ggctacggcg ccggcgtggc cggcgccctg 2520 

gtggccttca aggtgatgag cggcgagatg cccagcaccg aggacctggt gaacctgctg 2580 

cccgccatcc tgagccccgg cgccctggtg gtgggcgtgg tgtgcgccgc catcctgcgc 2640 

cgccacgtgg gccccggcga gggcgccgtg cagtggatga accgcctgat cgccttcgcc 2700 

agccgcggca accacgtgag ccccacccac tacgtgcccg agagcgacgc cgccgcccgc 2760 

gtgacccaga tcctgagcag cctgaccatc acccagctgc tgaagcgcct gcaccagtgg 2820 

atcaacgagg actgcagcac cccctgcagc ggcagctggc tgcgcgacgt gtgggactgg 2880 

atctgcaccg tgctgaccga cttcaagacc tggctgcaga gcaagctgct gccccagctg 2940 

cccggcgtgc ccttcttcag ctgccagcgc ggctacaagg gcgtgtggcg cggcgacggc 3000 

atcatgcaga ccacctgccc ctgcggcgcc cagatcaccg gccacgtgaa gaacggcagc 3060 

atgcgcatcg tgggccccaa gacctgcagc aacacctggc acggcacctt ccccatcaac 3120 

gcctacacca ccggcccctg cacccccagc cccgccccca actacagccg cgccctgtgg 3180 

cgcgtggccg ccgaggagta cgtggaggtg acccgcgtgg gcgacttcca ctacgtgacc 3240 

ggcatgacca ccgacaacgt gaagtgcccc tgccaggtgc ccgcccccga gttcttcacc 3300 

gaggtggacg gcgtgcgcct gcaccgctac gcccccgcct gccgccccct gctgcgcgag 3360 

gaggtgacct tccaggtggg cctgaaccag tacctggtgg gcagccagct gccctgcgag 3420 

cccgagcccg acgtggccgt gctgaccagc atgctgaccg accccagcca catcaccgcc 3480 

gagaccgcca agcgccgcct ggcccgcggc agccccccca gcctggccag cagcagcgcc 3540 

agccagctga gcgcccccag cctgaaggcc acctgcacca cccaccacgt gagccccgac 3600 

gccgacctga tcgaggccaa cctgctgtgg cgccaggaga tgggcggcaa catcacccgc 3660 

gtggagagcg agaacaaggt ggtggtgctg gacagcttcg accccctgcg cgccgaggag 3720 

gacgagcgcg aggtgagcgt gcccgccgag atcctgcgca agagcaagaa gttccccgct 3780 

gccatgccca tctgggctag acctgattac aaccctcccc tgctggagag ctggaaggac 3840 

cctgattacg tgcctccagt ggtgcatggc tgtcctctgc ctcccattaa agcccctcct 3900 

attccacctc ctaggcgcaa aaggaccgtg gtgctgacag aaagcagcgt gagctctgct 3960 

ctggccgaac tggccaccaa gacctttggc agcagcgaga gctctgccgt ggacagcgga 4020 

acagccaccg ctctgcctga ccaggccagc gacgacggcg ataagggcag cgatgtggag 4080 

agctatagca gcatgcctcc cctggaaggc gaacctggcg atcccgatct gagcgatggc 4140 

agctggagca ccgtgagcga agaggccagc gaggacgtgg tgtgttgcag catgagctac 4200 

acctggacag gcgctctgat cacaccctgc gctgccgagg agagcaagct gcccatcaac 4260 

gccctgagca acagcctgct gaggcaccac aacatggtgt acgccaccac cagcaggtct 4320 

gccggactga ggcagaagaa ggtgaccttc gaccgcctgc aggtgctgga cgaccactac 4380 

cgcgatgtgc tgaaggagat gaaggccaag gccagcaccg tgaaggccaa gctgctgagc 4440 

gtggaggagg cctgcaagct gacccccccc cacagcgcca agagcaagtt cggctacggc 4500 

gccaaggacg tgcgcaacct gagcagcaag gccgtgaacc acatccacag cgtgtggaag 4560 

gacctgctgg aggacaccgt gacccccatc gacaccacca tcatggccaa gaacgaggtg 4620 

ttctgcgtgc agcccgagaa gggcggccgc aagcccgccc gcctgatcgt gttccccgac 4680 

ctgggcgtgc gcgtgtgcga gaagatggcc ctgtacgacg tggtgagcac cctgccccag 4740 

gtggtgatgg gcagcagcta cggcttccag tacagccccg gccagcgcgt ggagttcctg 4800 




gtgaacacct ggaagagcaa gaagaacccc atgggcttca gctacgacac ccgctgcttc 4860 

gacagcaccg tgaccgagaa cgacatccgc gtggaggaga gcatctacca gtgctgcgac 4920 

ctggcccccg aggcccgcca ggccatcaag agcctgaccg agcgcctgta catcggcggc 4980 

cccctgacca acagcaaggg ccagaactgc ggctaccgcc gctgccgcgc cagcggcgtg 5040 

ctgaccacca gctgcggcaa caccctgacc tgctacctga aggccagcgc cgcctgccgc 5100 

gccgccaagc tgcaggactg caccatgctg gtgaacgccg ccggcctggt ggtgatctgc 5160 

gagagcgccg gcacccagga ggacgccgcc agcctgcgcg tgttcaccga ggccatgacc 5220 

cgctacagcg ccccccccgg cgaccccccc cagcccgagt acgacctgga gctgatcacc 5280 

agctgcagca gcaacgtgag cgtggcccac gacgccagcg gcaagcgcgt gtactacctg 5340 

acccgcgacc ccaccacccc cctggcccgc gccgcctggg agaccgcccg ccacaccccc 5400 

gtgaacagct ggctgggcaa catcatcatg tacgccccca ccctgtgggc ccgcatgatc 5460 

ctgatgaccc acttcttcag catcctgctg gcccaggagc agctggagaa ggccctggac 5520 

tgccagatct acggcgcctg ctacagcatc gagcccctgg acctgcccca gatcatcgag 5580 

cgcctgcacg gcctgagcgc cttcagcctg cacagctaca gccccggcga gatcaaccgc 5640 

gtggccagct gcctgcgcaa gctgggcgtg ccccccctgc gcgtgtggcg ccaccgcgcc 5700 

cgcagcgtgc gcgcccgcct gctgagccag ggcggccgcg ccgccacctg cggcaagtac 5760 

ctgttcaact gggccgtgaa gaccaagctg aagctgaccc ccatccccgc cgccagccag 5820 

ctggacctga gcggctggtt cgtggccggc tacagcggcg gcgacatcta ccacagcctg 5880 

agccgcgccc gcccccgctg gttcatgctg tgcctgctgc tgctgagcgt gggcgtgggc 5940 

atctacctgc tgcccaaccg ctaaa 5965 

<210> 12 
<211> 10 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Ribosome binding site 



<400> 12 

gccaccaugg 10 

<210> 13 
<211> 49 
<212> RNA 

<213> Artificial Sequence 
<220> 

<223> Synthetic polyadenylation signal 
<400> 13 

aauaaaagau cuuuauuuuc auuagaucug uguguugguu uuuugugug 49 

<210> 14 
<211> 28 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Additional nucleotides present in pV1Jns-NS 
<400> 14 

tctagagcgt ttaaaccctt aattaagg 28 

<210> 15 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Additional nucleotides present in pV1Jns-NSOPTmut 
<400> 15 

tttaaatgtt taaac 15 

<210> 16 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 16 

tcgaatcgat acgcgaacct acgc 24 

<210> 17 
<211> 37 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Oligonucleotide primer 
<400> 17 

tcgacgtgtc gacttcgaag cgcacaccaa aaacgtc 37 